This shows you the differences between two versions of the page.
projects:bpm-sis18:status [2012/07/12 12:55] rhaseitl |
projects:bpm-sis18:status [2012/07/12 19:00] rhaseitl |
||
---|---|---|---|
Line 2: | Line 2: | ||
Errors occurring sporadically: | Errors occurring sporadically: | ||
- | * the Liberas lose their connection: they appear red in the detailed status panel, giving the status " | + | * the Liberas lose their connection: they appear red in the detailed status panel, giving the status " |
In at least one case, I also had to restart the FESA classes on the CCCPs to get the system working again. | In at least one case, I also had to restart the FESA classes on the CCCPs to get the system working again. | ||
This sometimes happens during beamtime, or after the system has not been used for a while and is started again (= the GUI was closed for a while and is reopened). | This sometimes happens during beamtime, or after the system has not been used for a while and is started again (= the GUI was closed for a while and is reopened). | ||
Line 22: | Line 22: | ||
* connection to BPM established/ | * connection to BPM established/ | ||
* debug output at every status change of the system (Initializing, | * debug output at every status change of the system (Initializing, | ||
+ | * logging should use the Log4j framework (possible from within the FESA class via SDLog) (HBr) | ||
+ | * the GUI should **not** encapsulate exceptions thrown by cmw / rda into its own Exception class (HBr) | ||
+ | * detailed documentation of the meaning of and reasons for each error message, exception etc. should be written (HBr) | ||
\\ | \\ | ||
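The "debug output at every status change" item could be sketched as follows. This is only a minimal illustration in plain Java, not the GUI's actual code and not Log4j: the class `StatusLog`, its methods, and every status name except Initializing are hypothetical. It shows one timestamped line per real transition, with the clock injected so that the output can be checked.

```java
import java.time.Clock;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: keeps one timestamped line per status change. */
final class StatusLog {
    /** Illustrative status names; only "Initializing" is taken from the notes. */
    enum Status { INITIALIZING, READY, ACQUIRING, ERROR }

    private final Clock clock;                       // injected so tests can fix the time
    private final List<String> lines = new ArrayList<>();
    private Status current;                          // null until the first update

    StatusLog(Clock clock) {
        this.clock = clock;
    }

    /** Records a line only if the status really changed; returns true if a line was added. */
    boolean update(Status next) {
        if (next == current) {
            return false;                            // repeated status: nothing to log
        }
        String from = (current == null) ? "none" : current.name();
        lines.add(Instant.now(clock) + " status " + from + " -> " + next.name());
        current = next;
        return true;
    }

    /** Read-only copy of the lines recorded so far. */
    List<String> lines() {
        return List.copyOf(lines);
    }
}
```

In a real setup the `lines.add(...)` call would be replaced by a call to whatever logging framework is chosen; the timestamp would then normally come from the framework's own layout.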
Log on the generic servers (with timestamps!): | Log on the generic servers (with timestamps!): | ||
* version number (or similar) at startup | * version number (or similar) at startup | ||
- | * internal register values (when changed, on start, on stop) | + | * internal register values (when changed, on start trigger, on stop trigger) |
* when a start or stop trigger arrives | * when a start or stop trigger arrives | ||
* when the ring buffer is full | * when the ring buffer is full | ||
- | * operating mode (raw, bunch to bunch) | + | * operating mode (raw, bunch to bunch, calibrations, |
* log any other useful events | * log any other useful events | ||
+ | * log buffer overflows | ||
+ | * there seem to be logs on the Liberas under /var/log, but without timestamps. When a separate network is used for the Liberas, the time can't be queried from a global NTP server. -> Set up a " | ||
\\ | \\ | ||
Line 40: | Line 45: | ||
\\ | \\ | ||
Connection to the PTIF: | Connection to the PTIF: | ||
- | * Display the connection status and whether a command that has been sent was "acknowledged" by the PTIF | + | * Display the connection status and whether a command that has been sent was "acknowledged" by the PTIF. This can give a hint when the PTIF seems to be reachable by TCP/IP but the FESA class is not sending commands. Test case: pull the network cable and reconnect. The system should be able to detect if the PTIF cannot be controlled afterwards. |
- | **Would it make sense to have a simple | + | Have a standalone tool to see ALL system components directly: |
+ | Some of this information is provided by the detailed | ||
- | Yes, I think it does! (HBr) | ||
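The distinction made under "Connection to the PTIF", reachable by TCP/IP versus commands actually being acknowledged, could be tracked roughly as below. This is a sketch under assumptions: `AckTracker`, the integer command ids, and the timeout are hypothetical, and nothing here reflects the real PTIF protocol or the FESA class interface. The reachability flag is passed in from outside (for example from a ping or socket check).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: separates "reachable" from "commands are acknowledged". */
final class AckTracker {
    enum Link { OK, NO_ACK, UNREACHABLE }

    private final long timeoutMillis;
    // command id -> time (ms) at which the command was sent and not yet acknowledged
    private final Map<Integer, Long> pending = new HashMap<>();

    AckTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Call when a command is sent to the device. */
    void sent(int commandId, long nowMillis) {
        pending.put(commandId, nowMillis);
    }

    /** Call when the device acknowledges a command. */
    void acknowledged(int commandId) {
        pending.remove(commandId);
    }

    /** Combines an external reachability check with the overdue-acknowledgement check. */
    Link assess(boolean tcpReachable, long nowMillis) {
        if (!tcpReachable) {
            return Link.UNREACHABLE;
        }
        for (long sentAt : pending.values()) {
            if (nowMillis - sentAt > timeoutMillis) {
                return Link.NO_ACK;      // reachable, but a command was never acknowledged
            }
        }
        return Link.OK;
    }
}
```

For the cable-pull test case, the expected sequence would be `UNREACHABLE` while the cable is out and, if the device cannot be controlled after reconnecting, `NO_ACK` once the timeout has passed.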
== Goals == | == Goals == | ||
Line 51: | Line 56: | ||
Provide tools to observe the health status of the system components. | Provide tools to observe the health status of the system components. | ||
- | Some issues | + | |
- | * logging | + | |
- | * the GUI should | + | Some more considerations |
- | * a detailed documentation | + | * I strongly support having as much logging |
+ | * From my point of view it is very important that we clearly understand what SHOULD happen, e.g. when the user presses a button, BEFORE we try to understand why something we intend to do DOES NOT happen. | ||
+ | | ||
+ | * I also support the idea of a standalone diagnostic tool as described above; just make sure the displayed information is clearly defined