Use the Qemu guest agent with Memory Overcommitment Manager

Qemu now has an official guest agent. Programs on a KVM host can now talk to guests using a virtual hardware channel (either virtio-serial or an emulated ISA serial port). Using this mechanism, it is now possible to reliably power down a guest, read and write files, and perform filesystem snapshotting. More information about the guest agent is here.

As of today, Memory Overcommitment Manager (MOM) can use the qemu guest agent for guest statistics collection. This has several key advantages:

  • No need for host->guest network connectivity
  • No need to map guest names to guest IP addresses
  • Fast and secure hardware channel
  • Uses a standard guest agent

Configuring MOM to use the qemu guest agent involves a few simple steps on the host and guest:

Host setup

1. Update MOM

Download the latest version of MOM from GitHub and install it on your host.

2. Reconfigure MOM

  • Remove any existing name-to-ip setting from mom.conf
  • Remove GuestNetworkDaemon from the list of guest collectors (if present)
  • Add GuestQemuAgent to the list of guest collectors
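
As a rough illustration of the collector change, the sketch below rewrites a collectors value the way the bullets above describe. It is not part of MOM; the function name `switch_to_qemu_agent` is made up for this example, and it assumes the value is a comma-separated list of Collector names, as in the `collectors:` line of the `[guest]` section of mom.conf.

```python
def switch_to_qemu_agent(collectors: str) -> str:
    """Return a collectors value that uses GuestQemuAgent instead of GuestNetworkDaemon.

    `collectors` is assumed to be the comma-separated value of the
    `collectors:` option from the [guest] section of mom.conf.
    """
    # Split on commas and ignore empty entries and stray whitespace.
    names = [n.strip() for n in collectors.split(",") if n.strip()]
    # Drop the network-based collector if it is present.
    names = [n for n in names if n != "GuestNetworkDaemon"]
    # Add the guest agent collector once, keeping the existing order.
    if "GuestQemuAgent" not in names:
        names.append("GuestQemuAgent")
    return ", ".join(names)


if __name__ == "__main__":
    print(switch_to_qemu_agent("GuestQemuProc, GuestMemory, GuestNetworkDaemon"))
```

The function only computes the new value; editing mom.conf itself is still a manual step here, which keeps the comments in the file intact.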

3. Modify your guest domain.xml files to create the communication channels:

Place the following stanza into the devices section of each domain configuration file:

<serial type='unix'>
  <source mode='bind' path='/var/lib/libvirt/qemu/va-$NAME-isa.sock'/>
  <target port='1'/>
</serial>
<channel type='unix'>
  <source mode='bind' path='/var/lib/libvirt/qemu/va-$NAME-virtio.sock'/>
  <target type='virtio' name='org.qemu.guest_agent.0'/>
  <address type='virtio-serial' controller='0' bus='0' port='1'/>
</channel>

Replace $NAME with the guest’s name. Redefine the domains and restart each VM so that the changes take effect.

Note: This example creates both an ISA-serial channel and a virtio-serial channel. The MOM Collector automatically connects to whichever of these channels is active. Always specifying both lets you keep a standardized libvirt XML configuration regardless of underlying guest support.
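
If you have many guests, the $NAME substitution can be scripted. The following is a minimal sketch, not a tool shipped with MOM or libvirt: it takes a domain XML document as a string, reads the guest name from its `<name>` element, and appends the two channel definitions shown above to the devices section. The function name `add_agent_channels` and the `SOCKET_DIR` constant are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Directory used for the host-side sockets in the stanza above.
SOCKET_DIR = "/var/lib/libvirt/qemu"


def add_agent_channels(domain_xml: str) -> str:
    """Return domain_xml with the ISA-serial and virtio-serial agent channels added."""
    root = ET.fromstring(domain_xml)
    name = root.findtext("name")
    if not name:
        raise ValueError("domain XML has no <name> element")

    devices = root.find("devices")
    if devices is None:
        devices = ET.SubElement(root, "devices")

    # ISA serial channel, for guests without virtio-serial support.
    serial = ET.SubElement(devices, "serial", {"type": "unix"})
    ET.SubElement(serial, "source",
                  {"mode": "bind", "path": f"{SOCKET_DIR}/va-{name}-isa.sock"})
    ET.SubElement(serial, "target", {"port": "1"})

    # virtio-serial channel with the guest agent's well-known name.
    channel = ET.SubElement(devices, "channel", {"type": "unix"})
    ET.SubElement(channel, "source",
                  {"mode": "bind", "path": f"{SOCKET_DIR}/va-{name}-virtio.sock"})
    ET.SubElement(channel, "target",
                  {"type": "virtio", "name": "org.qemu.guest_agent.0"})
    ET.SubElement(channel, "address",
                  {"type": "virtio-serial", "controller": "0", "bus": "0", "port": "1"})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_agent_channels("<domain type='kvm'><name>demo</name><devices/></domain>"))
```

The sketch does not check whether the channels already exist, so running it twice on the same XML would add them twice. One possible workflow is to feed it the output of `virsh dumpxml` and pass the result to `virsh define`.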

Guest setup

1. Build the guest agent binary on your guest

The guest agent is supported in qemu versions v0.15.0-rc0-413-g957f1f9 or later. Download and build the qemu source on your guest and find the binary ‘qemu-ga’.

2. Configure the guest agent to start automatically

On a guest that supports virtio-serial, the agent can be started with no arguments. Follow your distribution’s recommendations to start the agent automatically at boot. If you were previously using the mom-guestd agent, you can use the same process for autostarting qemu-ga.
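
Once the agent is running, you can check from the host that it answers on the channel. The sketch below is only an illustration and is not how MOM itself is implemented: it connects to the host-side UNIX socket, sends a single JSON command terminated by a newline, and decodes the one-line JSON reply. The function name `agent_command` is made up for this example, and the sketch assumes that no other program is connected to the socket at the same time.

```python
import json
import socket


def agent_command(sock_path: str, command: str, timeout: float = 5.0) -> dict:
    """Send one guest agent command over the host-side socket and return the reply.

    sock_path is the socket created by the domain XML, for example
    /var/lib/libvirt/qemu/va-$NAME-virtio.sock with $NAME filled in.
    """
    # The agent speaks line-oriented JSON: {"execute": "<command>"}.
    request = json.dumps({"execute": command}).encode("ascii") + b"\n"
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(sock_path)
        sock.sendall(request)
        # Read a single newline-terminated reply.
        with sock.makefile("rb") as reader:
            line = reader.readline()
    if not line:
        raise ConnectionError("agent closed the channel without replying")
    return json.loads(line)
```

Calling `agent_command(path, "guest-ping")` against a healthy agent should return a dictionary with a `return` key. A timeout or a refused connection suggests that the agent is not running in the guest or that the channel is not set up as described above.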

Testing it out

Assuming the above has gone ok, when you start up MOM on the host and start some guests, the mom log should show messages like “GuestMonitor-$NAME is ready” and your statistics should appear as usual.
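
To check this across many guests, a small script can pull the guest names out of the log. This is a minimal sketch that relies only on the “GuestMonitor-$NAME is ready” message text quoted above. The function name `ready_guests` is made up for this example, and the location of the log file depends on the `log:` setting in your mom.conf.

```python
import re

# Matches the message MOM logs once a guest's monitor has collected data.
READY_RE = re.compile(r"GuestMonitor-(\S+) is ready")


def ready_guests(log_lines):
    """Return the names of guests reported as ready, in first-seen order."""
    seen = []
    for line in log_lines:
        match = READY_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


if __name__ == "__main__":
    sample = [
        "mom.Monitor - INFO - GuestMonitor-guest1 starting",
        "mom.Monitor - INFO - GuestMonitor-guest1 is ready",
    ]
    print(ready_guests(sample))
```

To use it on a real log, pass an open file object, for example `ready_guests(open(path))`, where `path` is your MOM log file.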

About aglitke

I am a software engineer working on Linux, open source software, and virtualization. I am proud to work at Red Hat on oVirt and Red Hat Virtualization with a focus on software defined storage. Other notable projects I have been involved in include: The Linux ppc64 architecture, Linux kernel crash dumps (kdump), Linux huge pages and libhugetlbfs, qemu, libvirt, and the Memory Overcommitment Manager.
This entry was posted in KVM, libvirt, MOM, qemu.

37 Responses to Use the Qemu guest agent with Memory Overcommitment Manager

  1. Henrik Uggla says:

    Hi!
    I’ve now tested this approach and I still can’t make it work. Here’s my log:

    2013-05-22 10:03:37,393 – mom – INFO – MOM starting
    2013-05-22 10:03:37,394 – mom.HostMonitor – INFO – Host Monitor starting
    2013-05-22 10:03:37,394 – mom – INFO – hypervisor interface libvirt
    2013-05-22 10:03:37,394 – mom.HostMonitor – DEBUG – Using fields: set([‘swap_out’, ‘mem_unuused’, ‘mem_available’, ‘anon_pages’, ‘mem_free’, ‘swap_in’])
    2013-05-22 10:03:37,404 – mom.HostMonitor – INFO – HostMonitor is ready
    2013-05-22 10:03:37,407 – mom.GuestManager – INFO – Guest Manager starting
    2013-05-22 10:03:37,428 – mom.Policy – INFO – Loaded policy ’50_main_’
    2013-05-22 10:03:37,428 – mom.PolicyEngine – INFO – Policy Engine starting
    2013-05-22 10:03:37,430 – mom.PolicyEngine – DEBUG – Loaded Balloon controller
    2013-05-22 10:03:37,430 – mom.RPCServer – INFO – RPC Server is disabled
    2013-05-22 10:03:37,434 – mom.Monitor – INFO – GuestMonitor-sbkqgis starting
    2013-05-22 10:03:37,434 – mom.Monitor – DEBUG – Using fields: set([‘swap_out’, ‘balloon_cur’, ‘mem_free’, ‘host_minor_faults’, ‘swap_in’, ‘major_fault’, ‘host_major_faults’, ‘mem_available’, ‘balloon_max’, ‘minor_fault’, ‘mem_unused’, ‘rss’])
    2013-05-22 10:03:37,437 – mom.Collectors.GuestMemory – WARNING – getVmMemoryStats() error: libvirt memoryStats() is not active
    2013-05-22 10:03:37,439 – mom.Collectors.GuestQemuAgent – DEBUG – Connection failed: ProtocolError (2): Connection failed: No such file or directory
    2013-05-22 10:03:37,439 – mom.Monitor – WARNING – GuestMonitor-sbkqgis: Collection error: Unable to connect to agent
    2013-05-22 10:03:37,439 – mom.Monitor – DEBUG – GuestMonitor-sbkqgis: Incomplete data: missing set([‘swap_out’, ‘mem_free’, ‘major_fault’, ‘swap_in’, ‘mem_unused’, ‘minor_fault’, ‘mem_available’])

    In /var/lib/libvirt/qemu I have the following files: sbkqgis.monitor, va-sbkqgis-isa.sock, va-sbkqgis-virtio.sock. Should there be a sbkqgis.agent too?

    regards
    /Henrik

  2. Henrik Uggla says:

    I got it working! In the domain.xml it has to be:

    But then it’s only possible to have one channel defined.

  3. Henrik Uggla says:
  4. Henrik Uggla says:

    “”

  5. Henrik Uggla says:

    It seems I can’t post the code.
    Well, what I meant was:
    Remove one channel and change va-$NAME-isa.sock to $NAME.agent for the other one.

    cheers
    /Henrik

  6. Sebastian Maj says:

    Hi,
    I really like the idea of MOM. I’ve tried this configuration and I can’t get it to work. qemu-ga has been installed and I can issue commands through the agent, but when I started MOM (as root) I got:
    In mom log:
    mom.Collectors.GuestQemuAgent – DEBUG – Connection failed: ProtocolError (11): Connection failed: Resource temporarily unavailable
    2013-08-06 10:23:07,328 – mom.Monitor – WARNING – GuestMonitor-MY_VM_NAME: Collection error: Unable to connect to agent
    In libvirtd.log:
    702: error : virNetSocketReadWire:1184 : End of file while reading data: Input/output error
    Sys: CentOS 6.4, MOM from EPEL.

    Adam, can you help me?

    • aglitke says:

      Could you pastebin your domain XML and reply with the link?

      702: error : virNetSocketReadWire:1184 : End of file while reading data: Input/output error

      This is a surprising error because qemu-ga traffic should flow over virtio-serial. Make sure you have lines like the following in your domain.xml:

      <serial type='unix'>
        <source mode='bind' path='/var/lib/libvirt/qemu/va-$NAME-isa.sock'/>
        <target port='1'/>
      </serial>
      <channel type='unix'>
        <source mode='bind' path='/var/lib/libvirt/qemu/va-$NAME-virtio.sock'/>
        <target type='virtio' name='org.qemu.guest_agent.0'/>
        <address type='virtio-serial' controller='0' bus='0' port='1'/>
      </channel>
      
  7. Sebastian Maj says:

    In the pastebin, the names of the agent socket and the guest are different. That was my mistake when pasting; on the server they have the same value.

    • aglitke says:

      After these messages appear, can you still connect to the agent to execute commands by another method? Is the guest agent still running? What messages have been entered into the agent log corresponding to the MOM and libvirt errors?

      • Sebastian Maj says:

        Yes, I can successfully issue agent commands via virsh. Agent is running but log is empty.

      • This is because the qemu-guest-agent socket is already taken by libvirtd, which automatically connects if the channel is defined. This connection is used to implement ‘virsh qemu-agent-command’ and similar commands.

        A patch to fix this issue has been pushed to gerrit: http://gerrit.ovirt.org/#/c/18888/
        With this patch, MoM can by default use the libvirt API (which virsh also uses) to communicate with qemu-guest-agent.

      • Tomoki Sekiyama says:

        Thanks for reviewing my patch.
        It seems it failed to merge because it depends on some earlier patches, starting from http://gerrit.ovirt.org/#/c/18887/ , which fix an exception raised when an error is reported from qemu-guest-agent.
        Could you review them too?
        Thanks,

  8. Henrik Uggla says:

    I updated to latest mom in git today and have some trouble. I get the guest ready message but no memory ballooning is taking place and I get this error over and over again:

    mom.Policy – ERROR – Policy error: undefined symbol guest_free_percent

    Please help.

    • aglitke says:

      Hi. Thanks for the report.

      Can you show me the exact copy of the mom policy and configuration file you are using? I cannot reproduce this problem locally.

      • Henrik Uggla says:

        mom.rules:
        ### Auto-Balloon ###############################################################

        ### Constants
        # If the percentage of host free memory drops below this value
        # then we will consider the host to be under memory pressure
        (defvar pressure_threshold 0.20)

        # If pressure threshold drops below this level, then the pressure
        # is critical and more aggressive ballooning will be employed.
        (defvar pressure_critical 0.05)

        # This is the minimum percentage of free memory that an unconstrained
        # guest would like to maintain
        (defvar min_guest_free_percent 0.20)

        # Don’t change a guest’s memory by more than this percent of total memory
        (defvar max_balloon_change_percent 0.05)

        # Only ballooning operations that change the balloon by this percentage
        # of current guest memory should be undertaken to avoid overhead
        (defvar min_balloon_change_percent 0.0025)

        ### Helper functions
        # Check if the proposed new balloon value is a large-enough
        # change to justify a balloon operation. This prevents us from
        # introducing overhead through lots of small ballooning operations
        (def change_big_enough (guest new_val)
        {
        (if (> (abs (- new_val guest.balloon_cur))
        (* min_balloon_change_percent guest.balloon_cur))
        1 0)
        })

        (def shrink_guest (guest)
        {
        # Determine the degree of host memory pressure
        (if (<= host_free_percent pressure_critical)
        # Pressure is critical:
        # Force guest to swap by making free memory negative
        (defvar guest_free_percent (+ -0.05 host_free_percent))
        # Normal pressure situation
        # Scale the guest free memory back according to host pressure
        (defvar guest_free_percent (* min_guest_free_percent
        (/ host_free_percent pressure_threshold))))

        # Given current conditions, determine the ideal guest memory size
        (defvar guest_used_mem (- (guest.StatAvg "balloon_cur")
        (guest.StatAvg "mem_unused")))
        (defvar balloon_min (min guest.balloon_min (+ guest_used_mem
        (* guest_free_percent guest.balloon_cur))))
        # But do not change it too fast
        (defvar balloon_size (* guest.balloon_cur
        (- 1 max_balloon_change_percent)))
        (if (< balloon_size balloon_min)
        (set balloon_size balloon_min)
        0)
        # Set the new target for the BalloonController. Only set it if the
        # value makes sense and is a large enough change to be worth it.
        (if (and (<= balloon_size guest.balloon_cur)
        (change_big_enough guest balloon_size))
        (guest.Control "balloon_target" balloon_size)
        0)
        })

        (def grow_guest (guest)
        {
        # There is only work to do if the guest is ballooned
        (if ( balloon_size guest.balloon_max)
        (set balloon_size guest.balloon_max) 0)
        (if (< balloon_size balloon_min)
        (set balloon_size balloon_min) 0)
        (if (change_big_enough guest balloon_size)
        (guest.Control "balloon_target" balloon_size) 0)
        } 0)
        })

        ### Main script
        # Methodology: The goal is to shrink all guests fairly and by an amount
        # scaled to the level of host memory pressure. If the host is under
        # severe pressure, scale back more aggressively. We don't yet handle
        # symptoms of over-ballooning guests or try to balloon idle guests more
        # aggressively. When the host is not under memory pressure, slowly
        # deflate the balloons.

        (defvar host_free_percent (/ (Host.StatAvg "mem_free") Host.mem_available))
        (if (< host_free_percent pressure_threshold)
        (with Guests guest (shrink_guest guest))
        (with Guests guest (grow_guest guest)))

      • aglitke says:

        Henrik,

        I have found a few syntax errors in your rules file.

        --- mom.rules	2013-09-19 08:32:48.990050945 -0400
        +++ mom.rules.new	2013-09-19 08:34:04.966052332 -0400
        @@ -66,7 +66,7 @@
         (def grow_guest (guest)
         {
             # There is only work to do if the guest is ballooned
        -    (if ( balloon_size guest.balloon_max)
        +    (if (< balloon_size guest.balloon_max)
                 (set balloon_size guest.balloon_max) 0)
         
             (if (< balloon_size balloon_min)
        @@ -74,7 +74,6 @@
         
             (if (change_big_enough guest balloon_size)
                 (guest.Control "balloon_target" balloon_size) 0)
        -    } 0)
         })
         
         ### Main script
        
        

        You are missing a ‘<’ in that first ‘if’ and you had an extraneous ‘} 0)’ near the end of ‘grow_guest’.

      • Henrik Uggla says:

        mom.conf:

        [main]
        # The wake up frequency of the main daemon (in seconds)
        main-loop-interval: 5

        # The data collection interval for host statistics (in seconds)
        host-monitor-interval: 5

        # The data collection interval for guest statistics (in seconds)
        guest-monitor-interval: 5

        # The wake up frequency of the guest manager (in seconds). The guest manager
        # sets up monitoring and control for newly-created guests and cleans up after
        # deleted guests.
        guest-manager-interval: 5

        # The wake up frequency of the policy engine (in seconds). During each
        # interval the policy engine evaluates the policy and passes the results
        # to each enabled controller plugin.
        policy-engine-interval: 10

        # The interface MOM using to discover active guests and collect guest memory
        # statistics. There’re two choices for it: libvirt or vdsm.
        hypervisor-interface: libvirt

        # A comma-separated list of Controller plugins to enable
        controllers: Balloon

        # Sets the maximum number of statistic samples to keep for the purpose of
        # calculating moving averages.
        sample-history-length: 10

        # The URI to use when connecting to this host’s libvirt interface. If this is
        # left blank then the system default URI is used.
        libvirt-hypervisor-uri:

        # Set this to an existing, writable directory to enable plotting. For each
        # invocation of the program a subdirectory momplot-NNN will be created where NNN
        # is a sequence number. Within that directory, tab-delimited data files will be
        # created and updated with all data generated by the configured Collectors.
        plot-dir:

        # Activate the RPC server on the designated port (-1 to disable). RPC is
        # disabled by default until authentication is added to the protocol.
        rpc-port: -1

        # At startup, load a policy from the given file. If empty, no policy is loaded
        policy:

        # At startup, load policies from the given directory. Only filenames matching
        # *.policy will be considered. Each loaded policy will be named according to
        # the file’s basename. Policies are concatenated in alphabetical order by name
        # for evaluation.
        policy-dir:

        [logging]
        # Set the destination for program log messages. This can be either ‘stdio’ or
        # a filename. When the log goes to a file, log rotation will be done
        # automatically.
        #log: stdio

        # Set the logging verbosity level. The following levels are supported:
        # 5 or debug: Debugging messages
        # 4 or info: Detailed messages concerning normal program operation
        # 3 or warn: Warning messages (program operation may be impacted)
        # 2 or error: Errors that severely impact program operation
        # 1 or critical: Emergency conditions
        # This option can be specified by number or name.
        verbosity: debug

        log: /var/log/mom.log
        ## The following two variables are used only when logging is directed to a file.
        # Set the maximum size of a log file (in bytes) before it is rotated.
        max-bytes: 2097152
        # Set the maximum number of rotated logs to retain.
        backup-count: 3

        [host]
        # A comma-separated list of Collector plugins to use for Host data collection.
        collectors: HostMemory

        [guest]
        # A comma-separated list of Collector plugins to use for Guest data collection.
        collectors: GuestQemuProc, GuestMemory, GuestBalloon, GuestQemuAgent

        # Collector-specific configuration for GuestQemuAgent
        [Collector: GuestQemuAgent]
        # Set the base path where the host-side sockets for guest communication can be
        # found. The GuestQemuAgent Collector will try to open the socket:
        # <socket_path>/<guest-name>.agent
        socket_path: /var/lib/libvirt/qemu

        [Collector: GuestNetworkDaemon]
        # Helper program to convert guest names to IP addresses. This is only used by
        # the GuestNetworkDaemon Collector. See doc/name-to-ip for an example.
        #name-to-ip-helper: doc/name-to-ip

  9. Henrik Uggla says:

    I redid the install and now my mom.rules look like this. I still get the same error.

    ### Auto-Balloon ###############################################################

    ### Constants
    # If the percentage of host free memory drops below this value
    # then we will consider the host to be under memory pressure
    (defvar pressure_threshold 0.20)

    # If pressure threshold drops below this level, then the pressure
    # is critical and more aggressive ballooning will be employed.
    (defvar pressure_critical 0.05)

    # This is the minimum percentage of free memory that an unconstrained
    # guest would like to maintain
    (defvar min_guest_free_percent 0.20)

    # Don’t change a guest’s memory by more than this percent of total memory
    (defvar max_balloon_change_percent 0.05)

    # Only ballooning operations that change the balloon by this percentage
    # of current guest memory should be undertaken to avoid overhead
    (defvar min_balloon_change_percent 0.0025)

    ### Helper functions
    # Check if the proposed new balloon value is a large-enough
    # change to justify a balloon operation. This prevents us from
    # introducing overhead through lots of small ballooning operations
    (def change_big_enough (guest new_val)
    {
    (if (> (abs (- new_val guest.balloon_cur))
    (* min_balloon_change_percent guest.balloon_cur))
    1 0)
    })

    (def shrink_guest (guest)
    {
    # Determine the degree of host memory pressure
    (if (<= host_free_percent pressure_critical)
    # Pressure is critical:
    # Force guest to swap by making free memory negative
    (defvar guest_free_percent (+ -0.05 host_free_percent))
    # Normal pressure situation
    # Scale the guest free memory back according to host pressure
    (defvar guest_free_percent (* min_guest_free_percent
    (/ host_free_percent pressure_threshold))))

    # Given current conditions, determine the ideal guest memory size
    (defvar guest_used_mem (- (guest.StatAvg "balloon_cur")
    (guest.StatAvg "mem_unused")))
    (defvar balloon_min (min guest.balloon_min (+ guest_used_mem
    (* guest_free_percent guest.balloon_cur))))
    # But do not change it too fast
    (defvar balloon_size (* guest.balloon_cur
    (- 1 max_balloon_change_percent)))
    (if (< balloon_size balloon_min)
    (set balloon_size balloon_min)
    0)
    # Set the new target for the BalloonController. Only set it if the
    # value makes sense and is a large enough change to be worth it.
    (if (and (<= balloon_size guest.balloon_cur)
    (change_big_enough guest balloon_size))
    (guest.Control "balloon_target" balloon_size)
    0)
    })

    (def grow_guest (guest)
    {
    # There is only work to do if the guest is ballooned
    (if ( balloon_size guest.balloon_max)
    (set balloon_size guest.balloon_max) 0)
    (if (< balloon_size balloon_min)
    (set balloon_size balloon_min) 0)
    (if (change_big_enough guest balloon_size)
    (guest.Control "balloon_target" balloon_size) 0)
    } 0)
    })

    ### Main script
    # Methodology: The goal is to shrink all guests fairly and by an amount
    # scaled to the level of host memory pressure. If the host is under
    # severe pressure, scale back more aggressively. We don't yet handle
    # symptoms of over-ballooning guests or try to balloon idle guests more
    # aggressively. When the host is not under memory pressure, slowly
    # deflate the balloons.

    (defvar host_free_percent (/ (Host.StatAvg "mem_free") Host.mem_available))
    (if (< host_free_percent pressure_threshold)
    (with Guests guest (shrink_guest guest))
    (with Guests guest (grow_guest guest)))

    • Henrik Uggla says:

      It seems that pasting stuff here is changing the code. Should I use some kind of code tag?

      • Henrik Uggla says:


        ### Auto-Balloon ###############################################################

        ### Constants
        # If the percentage of host free memory drops below this value
        # then we will consider the host to be under memory pressure
        (defvar pressure_threshold 0.20)

        # If pressure threshold drops below this level, then the pressure
        # is critical and more aggressive ballooning will be employed.
        (defvar pressure_critical 0.05)

        # This is the minimum percentage of free memory that an unconstrained
        # guest would like to maintain
        (defvar min_guest_free_percent 0.20)

        # Don't change a guest's memory by more than this percent of total memory
        (defvar max_balloon_change_percent 0.05)

        # Only ballooning operations that change the balloon by this percentage
        # of current guest memory should be undertaken to avoid overhead
        (defvar min_balloon_change_percent 0.0025)

        ### Helper functions
        # Check if the proposed new balloon value is a large-enough
        # change to justify a balloon operation. This prevents us from
        # introducing overhead through lots of small ballooning operations
        (def change_big_enough (guest new_val)
        {
        (if (> (abs (- new_val guest.balloon_cur))
        (* min_balloon_change_percent guest.balloon_cur))
        1 0)
        })

        (def shrink_guest (guest)
        {
        # Determine the degree of host memory pressure
        (if (<= host_free_percent pressure_critical)
        # Pressure is critical:
        # Force guest to swap by making free memory negative
        (defvar guest_free_percent (+ -0.05 host_free_percent))
        # Normal pressure situation
        # Scale the guest free memory back according to host pressure
        (defvar guest_free_percent (* min_guest_free_percent
        (/ host_free_percent pressure_threshold))))

        # Given current conditions, determine the ideal guest memory size
        (defvar guest_used_mem (- (guest.StatAvg "balloon_cur")
        (guest.StatAvg "mem_unused")))
        (defvar balloon_min (min guest.balloon_min (+ guest_used_mem
        (* guest_free_percent guest.balloon_cur))))
        # But do not change it too fast
        (defvar balloon_size (* guest.balloon_cur
        (- 1 max_balloon_change_percent)))
        (if (< balloon_size balloon_min)
        (set balloon_size balloon_min)
        0)
        # Set the new target for the BalloonController. Only set it if the
        # value makes sense and is a large enough change to be worth it.
        (if (and (<= balloon_size guest.balloon_cur)
        (change_big_enough guest balloon_size))
        (guest.Control "balloon_target" balloon_size)
        0)
        })

        (def grow_guest (guest)
        {
        # There is only work to do if the guest is ballooned
        (if ( balloon_size guest.balloon_max)
        (set balloon_size guest.balloon_max) 0)
        (if (< balloon_size balloon_min)
        (set balloon_size balloon_min) 0)
        (if (change_big_enough guest balloon_size)
        (guest.Control "balloon_target" balloon_size) 0)
        } 0)
        })

        ### Main script
        # Methodology: The goal is to shrink all guests fairly and by an amount
        # scaled to the level of host memory pressure. If the host is under
        # severe pressure, scale back more aggressively. We don't yet handle
        # symptoms of over-ballooning guests or try to balloon idle guests more
        # aggressively. When the host is not under memory pressure, slowly
        # deflate the balloons.

        (defvar host_free_percent (/ (Host.StatAvg "mem_free") Host.mem_available))
        (if (< host_free_percent pressure_threshold)
        (with Guests guest (shrink_guest guest))
        (with Guests guest (grow_guest guest)))

  10. Henrik Uggla says:

    Adam, the errors you found were introduced when I pasted the code in my reply; they are not present in my file.

  11. Henrik Uggla says:

    I’m running Ubuntu server 12.04 64bit (fully updated). I can post my mom.log tomorrow but it really doesn’t say much even with debug logging.

    • aglitke says:

      Hmm, it’s very strange that in your environment you’re getting a policy error and in mine I am not (with the same policy and config). Are you satisfied that the guest agent is returning all of the data?

      • Henrik Uggla says:

        Yes, it worked fine before the upgrade to latest in git. Could it be caused by different python versions?

  12. Henrik Uggla says:

    Here’s my mom.log:

    2013-09-19 15:19:33,086 – mom.HostMonitor – INFO – Host Monitor starting
    2013-09-19 15:19:33,086 – mom.HostMonitor – DEBUG – Using fields: set([‘swap_out’, ‘mem_available’, ‘anon_pages’, ‘mem_unused’, ‘mem_free’, ‘swap_in’])
    2013-09-19 15:19:33,088 – mom.HostMonitor – INFO – HostMonitor is ready
    2013-09-19 15:19:33,086 – mom – INFO – hypervisor interface libvirt
    2013-09-19 15:19:33,145 – mom.GuestManager – INFO – Guest Manager starting
    2013-09-19 15:19:33,163 – mom.Policy – INFO – Loaded policy ’50_main_’
    2013-09-19 15:19:33,163 – mom.PolicyEngine – INFO – Policy Engine starting
    2013-09-19 15:19:33,174 – mom.PolicyEngine – DEBUG – Loaded Balloon controller
    2013-09-19 15:19:33,175 – mom.RPCServer – INFO – RPC Server is disabled
    2013-09-19 15:19:33,237 – mom.Monitor – INFO – GuestMonitor-sbkqgis starting
    2013-09-19 15:19:33,237 – mom.Monitor – DEBUG – Using fields: set([‘swap_out’, ‘balloon_cur’, ‘mem_free’, ‘balloon_min’, ‘host_minor_faults’, ‘swap_in’, ‘major_fault’, ‘host_major_faults’, ‘mem_available’, ‘balloon_max’, ‘minor_fault’, ‘mem_unused’, ‘rss’])
    2013-09-19 15:19:33,241 – mom.Collectors.GuestMemory – WARNING – getVmMemoryStats() error: libvirt memoryStats() is not active
    2013-09-19 15:19:33,245 – mom.Monitor – INFO – GuestMonitor-sbkgeodata starting
    2013-09-19 15:19:33,246 – mom.Monitor – DEBUG – Using fields: set([‘swap_out’, ‘balloon_cur’, ‘mem_free’, ‘balloon_min’, ‘host_minor_faults’, ‘swap_in’, ‘major_fault’, ‘host_major_faults’, ‘mem_available’, ‘balloon_max’, ‘minor_fault’, ‘mem_unused’, ‘rss’])
    2013-09-19 15:19:33,250 – mom.Collectors.GuestMemory – WARNING – getVmMemoryStats() error: libvirt memoryStats() is not active
    2013-09-19 15:19:33,258 – mom.Monitor – INFO – GuestMonitor-skgis2 starting
    2013-09-19 15:19:33,258 – mom.Monitor – DEBUG – Using fields: set([‘swap_out’, ‘balloon_cur’, ‘mem_free’, ‘balloon_min’, ‘host_minor_faults’, ‘swap_in’, ‘major_fault’, ‘host_major_faults’, ‘mem_available’, ‘balloon_max’, ‘minor_fault’, ‘mem_unused’, ‘rss’])
    2013-09-19 15:19:33,260 – mom.Collectors.GuestMemory – WARNING – getVmMemoryStats() error: libvirt memoryStats() is not active
    2013-09-19 15:19:34,025 – mom.Monitor – INFO – GuestMonitor-sbkgeodata is ready
    2013-09-19 15:19:34,033 – mom.Monitor – INFO – GuestMonitor-sbkqgis is ready
    2013-09-19 15:19:34,040 – mom.Monitor – INFO – GuestMonitor-skgis2 is ready
    2013-09-19 15:19:43,189 – mom.Policy – ERROR – Policy error: undefined symbol guest_free_percent
    2013-09-19 15:19:53,202 – mom.Policy – ERROR – Policy error: undefined symbol guest_free_percent
    2013-09-19 15:20:03,218 – mom.Policy – ERROR – Policy error: undefined symbol guest_free_percent

  13. Henrik Uggla says:

    Adam, if you tell me how to download a slightly older version, I could perhaps help narrow the problem down. The version that did work was downloaded from git 21/5-2013.

    • aglitke says:

      Sure. Sounds like a perfect time to use ‘git-bisect’. Looks like the hash of the known good revision is 2176f367b305d24bd279a0d5e44112224c870329 (but please confirm) and HEAD is bad for you. Since then we have made changes to the default policy. Here is a good tutorial on how to use git bisect. Thanks for being persistent! I hope we’ll get to the bottom of this soon.

      • Henrik Uggla says:

        Ok, I’ve narrowed it down. ba40e873d293ba9d741afaf7a53f53ff4d17c33e was the last one to work. The next did not.

        mom.log:
        2013-09-20 15:49:13,116 – mom – INFO – MOM starting
        2013-09-20 15:49:13,124 – mom.HostMonitor – INFO – Host Monitor starting
        2013-09-20 15:49:13,124 – mom – INFO – hypervisor interface libvirt
        2013-09-20 15:49:13,124 – mom.HostMonitor – DEBUG – Using fields: set([‘swap_out’, ‘mem_unuused’, ‘mem_available’, ‘anon_pages’, ‘mem_free’, ‘swap_in’])
        2013-09-20 15:49:13,168 – mom.HostMonitor – INFO – HostMonitor is ready
        2013-09-20 15:49:13,170 – mom.GuestManager – INFO – Guest Manager starting
        2013-09-20 15:49:13,241 – mom.Policy – INFO – Loaded policy ’50_main_’
        2013-09-20 15:49:13,265 – mom.PolicyEngine – INFO – Policy Engine starting
        2013-09-20 15:49:13,266 – mom.RPCServer – INFO – RPC Server is disabled
        2013-09-20 15:49:13,268 – mom.PolicyEngine – DEBUG – Loaded Balloon controller
        2013-09-20 15:49:13,269 – mom.Monitor – INFO – GuestMonitor-sbkqgis starting
        2013-09-20 15:49:13,269 – mom.Monitor – DEBUG – Using fields: set([‘swap_out’, ‘balloon_cur’, ‘mem_free’, ‘host_minor_faults’, ‘swap_in’, ‘major_fault’, ‘host_major_faults’, ‘mem_available’, ‘balloon_max’, ‘minor_fault’, ‘min_guarantee’, ‘mem_unused’, ‘rss’])
        2013-09-20 15:49:13,272 – mom.Collectors.GuestMemory – WARNING – getVmMemoryStats() error: libvirt memoryStats() is not active
        2013-09-20 15:49:13,279 – mom.Monitor – INFO – GuestMonitor-sbkgeodata starting
        2013-09-20 15:49:13,279 – mom.Monitor – DEBUG – Using fields: set([‘swap_out’, ‘balloon_cur’, ‘mem_free’, ‘host_minor_faults’, ‘swap_in’, ‘major_fault’, ‘host_major_faults’, ‘mem_available’, ‘balloon_max’, ‘minor_fault’, ‘min_guarantee’, ‘mem_unused’, ‘rss’])
        2013-09-20 15:49:13,282 – mom.Collectors.GuestMemory – WARNING – getVmMemoryStats() error: libvirt memoryStats() is not active
        2013-09-20 15:49:13,289 – mom.Monitor – INFO – GuestMonitor-skgis2 starting
        2013-09-20 15:49:13,289 – mom.Monitor – DEBUG – Using fields: set([‘swap_out’, ‘balloon_cur’, ‘mem_free’, ‘host_minor_faults’, ‘swap_in’, ‘major_fault’, ‘host_major_faults’, ‘mem_available’, ‘balloon_max’, ‘minor_fault’, ‘min_guarantee’, ‘mem_unused’, ‘rss’])
        2013-09-20 15:49:13,291 – mom.Collectors.GuestMemory – WARNING – getVmMemoryStats() error: libvirt memoryStats() is not active
        2013-09-20 15:49:23,277 – mom.Policy – DEBUG – Results: [0.2, 0.05, 0.2, 0.05, 0.0025, ‘change_big_enough’, ‘shrink_guest’, ‘grow_guest’, 0.6521074510732292, []]
        2013-09-20 15:49:23,311 – mom.Monitor – INFO – GuestMonitor-sbkqgis starting
        2013-09-20 15:49:23,311 – mom.Monitor – DEBUG – Using fields: set([‘swap_out’, ‘balloon_cur’, ‘mem_free’, ‘host_minor_faults’, ‘swap_in’, ‘major_fault’, ‘host_major_faults’, ‘mem_available’, ‘balloon_max’, ‘minor_fault’, ‘min_guarantee’, ‘mem_unused’, ‘rss’])
        2013-09-20 15:49:23,314 – mom.Collectors.GuestMemory – WARNING – getVmMemoryStats() error: libvirt memoryStats() is not active

      • aglitke says:

        Thanks! I was able to reproduce and I think I have your fix here: http://gerrit.ovirt.org/#/c/19416/

        Could you test this out and let me know if it is working for you?

      • Uggla says:

        Thanks for the quick fix! I’ll try it when I get back to work on Monday.

      • Henrik Uggla says:

        Works very well. Thanks!

  14. Henrik Uggla says:

    Is there some way to restart mom without having to reboot the host?

    • aglitke says:

      Sure. If you’ve installed the RPM you can do ‘service momd restart’. Otherwise just kill -SIGINT and restart it. It will relaunch all of the necessary threads.
