
projects:bpm-sis18:status


Problems / System behaviour

Errors occurring sporadically:

  • the Liberas lose their connection: they appear red in the detailed status panel with the status “Software Error” or “BPM Communication Error”. The Libera can then no longer be controlled from TOPOS. Most of the time, rebooting the Libera via ssh helps.

In at least one case, I also had to restart the FESA classes on the CCCPs to get the system working again. This sometimes happens during beamtime, or when the system is started again after not being used for a while (i.e. the GUI was closed for a while and is reopened).

Other errors which could be related:

  • the saving of raw data sometimes leads to a timeout
  • aux BPM confuses the whole system (Start and Stop are triggered “by themselves”)
  • low system performance (switching modes takes several seconds)

Debugging ideas:

Suggestions (any idea that helps track down the problems is appreciated):

  • add debug output into the Libera generic servers and the FESA Classes (see below)
  • write test cases such as: set a defined set of calibration values and check that they arrive in the generic server/FPGA registers
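
The calibration test case above could be sketched roughly as follows. This is only an illustration: the register names and the `FakeRegisters` class are hypothetical stand-ins, and the `write`/`read` callables would have to be replaced by whatever access path the generic server or FESA class actually offers.

```python
from typing import Callable, Dict, List, Optional, Tuple


def check_calibration(
    values: Dict[str, int],
    write: Callable[[str, int], None],
    read: Callable[[str], Optional[int]],
) -> List[Tuple[str, int, Optional[int]]]:
    """Write all calibration values, read them back, and return the mismatches.

    Each mismatch is a tuple (register name, expected value, value read back).
    An empty list means every value arrived unchanged.
    """
    for name, expected in values.items():
        write(name, expected)

    mismatches = []
    for name, expected in values.items():
        actual = read(name)
        if actual != expected:
            mismatches.append((name, expected, actual))
    return mismatches


class FakeRegisters:
    """Hypothetical in-memory stand-in for the real register access."""

    def __init__(self) -> None:
        self._regs: Dict[str, int] = {}

    def write(self, name: str, value: int) -> None:
        self._regs[name] = value

    def read(self, name: str) -> Optional[int]:
        return self._regs.get(name)


# Hypothetical register names and values, for illustration only.
calibration = {"gain_a": 1000, "gain_b": 998, "offset_x": -12}
regs = FakeRegisters()
print("mismatches:", check_calibration(calibration, regs.write, regs.read))
```

Writing all values first and reading them back afterwards also catches the case where setting one value silently overwrites another.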

Log in the FESA classes:

  • version number on startup
  • connection to BPM established/lost/reconnected
  • debug output at every status change of the system (Initializing, Start, Stop,…)

Log on the generic servers (with timestamps!):

  • version number (or similar) at startup
  • internal register values (when changed, on start, on stop)
  • when a start or stop trigger arrives
  • when the ring buffer is full
  • operating mode (raw, bunch to bunch)
  • log any other useful events
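
As a minimal sketch of what such timestamped logging could look like, here is an example in Python. The real generic servers and FESA classes are not written in Python, so this only illustrates the log content and format; the logger name, version string, register name and state names are assumptions taken from or modelled on the lists above.

```python
import io
import logging


def make_logger(name, stream=None):
    """Return a logger that prefixes every line with a millisecond timestamp."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)  # stream=None means stderr
    handler.setFormatter(logging.Formatter(
        "%(asctime)s.%(msecs)03d %(name)s %(levelname)s %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    ))
    logger.handlers = [handler]   # replace handlers to avoid duplicate lines
    logger.propagate = False
    return logger


class StateLogger:
    """Logs every status change exactly once, with old and new state."""

    def __init__(self, logger, initial):
        self.logger = logger
        self.state = initial

    def set_state(self, new_state):
        if new_state != self.state:
            self.logger.info("state change: %s -> %s", self.state, new_state)
            self.state = new_state


def log_register_change(logger, name, old, new):
    """Log a register change as zero-padded hexadecimal values."""
    logger.debug("register %s: 0x%08X -> 0x%08X", name, old, new)


# Demo: the output is captured in memory here; in practice it would go to
# a log file, for example on the NFS mount mentioned under network considerations.
demo_output = io.StringIO()
log = make_logger("generic-server", demo_output)
log.info("startup, version %s", "0.0.0-example")   # hypothetical version string
status = StateLogger(log, "Initializing")
status.set_state("Start")
status.set_state("Start")                           # no change, nothing logged
log_register_change(log, "gain_a", 0x3E8, 0x3E6)    # hypothetical register
status.set_state("Stop")
print(demo_output.getvalue(), end="")
```

Using the same line format in every component makes it possible to merge the logs of the FESA classes and the generic servers by timestamp afterwards.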

Network considerations:

  • put the Liberas into their own network (not the GSI/ACC network)
  • boot file on CCCPs, static IPs, NFS mount to store debug logs

Would it make sense to have a standalone tool that checks whether the FESA servers, the BPMs, and the generic servers are up and running without an error flag?
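
A first version of such a tool could simply test whether each component accepts a TCP connection. The sketch below assumes that every component listens on a known TCP port; all host names and port numbers in it are hypothetical placeholders. Note that a successful connect only shows that a process is listening. Checking for an error flag would need a component-specific query on top of this.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unknown host, unreachable network
        return False


def check_all(components, timeout=2.0):
    """Map each component name to True (reachable) or False."""
    return {
        name: is_reachable(host, port, timeout)
        for name, (host, port) in components.items()
    }


def format_report(status):
    """One line per component, sorted by name, with UP or DOWN."""
    return "\n".join(
        f"{name:<24} {'UP' if ok else 'DOWN'}"
        for name, ok in sorted(status.items())
    )


# Hypothetical hosts and ports; replace with the real ones.
COMPONENTS = {
    "libera-01 (ssh)": ("libera-01.invalid", 22),
    "libera-01 (gen server)": ("libera-01.invalid", 5000),
    "cccp-01 (FESA)": ("cccp-01.invalid", 6000),
}
```

Running `print(format_report(check_all(COMPONENTS)))` would then give a one-glance overview that can be checked before beamtime or after the system has been idle for a while.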

Goals

Add logging output to all system components to know what is going on in each component. Provide tools to observe the health status of the system components.

projects/bpm-sis18/status.1342019921.txt.gz · Last modified: 2012/07/11 17:18 by rhaseitl