FreeEMS Issues - Loader
View Issue Details
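ID: 0000270
Project: Loader
Category: Serial Monitor Comms
View Status: public
Date Submitted: 2011-09-24 13:41
Last Update: 2013-01-18 23:07
Reporter: Fred
Assigned To: sean94z
Priority: high
Severity: minor
Reproducibility: always
Status: assigned
Resolution: reopened
Platform: Linux
OS: Debian
OS Version: Sid/Unstable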
 
0.1.0, 0.1.0
Bug
medium
0000270: First Connect Fails Several Times
First hit of connect returns instantly without success
Second hit hangs for a while, then returns without success
Third hit returns instantly without success
Fourth hit works reasonably

And variations of the above. Please fix; this is costing me time while loading firmware during dev.
No tags attached.

Notes
(0000470)
sean94z   
2011-10-24 21:36   
fixed since commit f5627738cac3a3dc06dda087a1d0b761aacd7411
(0000481)
Fred   
2011-10-24 23:13   
Will test and close tomorrow!
(0000483)
Fred   
2011-10-26 08:50   
I saw this yesterday, but not the same as before, and it was probably a one off, but, it got me thinking... :-) Yes, I know, dangerous...

Does the code currently do this:

 - Flush
 - Send init sequence
 - Listen for reply with short timeout
 - If receive reply, set connected state and mark load button
 - If no reply is received, mark connect button again
 - Stop

If so, what we should probably do is have a setting for connect retries, such that the sequence above (not including stop) is repeated N times in quick succession if it does not succeed the first time. Because the timeout is naturally short, the user would barely notice if it had to run 3 times, and the chances are it would *always* work on the second round.

This covers the case where you just HAPPEN to get a framing error in that first try, or something like that. It should also be really easy to do in code and really effective! Thoughts?
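
The retry idea in this note can be sketched roughly as follows. This is a minimal Python sketch, not the loader's actual code: `connect_with_retries`, `try_once`, and the default of 3 retries are hypothetical names and values picked for illustration, and `try_once` stands for one pass of flush, send init sequence, listen with a short timeout.

```python
from typing import Callable


def connect_with_retries(try_once: Callable[[], bool], retries: int = 3) -> int:
    """Repeat one connect attempt up to `retries` times in quick succession.

    Returns the 1-based number of the attempt that succeeded, or 0 if every
    attempt failed. `try_once` is a hypothetical callable that performs one
    flush / send init sequence / short listen and reports success.
    """
    for attempt in range(1, retries + 1):
        if try_once():
            return attempt
    return 0


if __name__ == "__main__":
    # Simulated link: the first attempt is lost (e.g. a framing error),
    # the second attempt gets a reply.
    outcomes = iter([False, True])
    print(connect_with_retries(lambda: next(outcomes)))
```

Only after the loop returns 0 would the UI need to report a failed connect, so a single bad first exchange never reaches the user.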
(0000484)
Fred   
2011-10-26 08:51   
Back to you for comment and/or minor change as per last note.
(0000487)
sean94z   
2011-10-27 00:39   
the code currently does this:

 - Send init sequence
 - Flush
 - Listen for reply with short timeout 3x
 - If receive reply, set connected state and mark load button
 - I could change the text to "try to connect again"
 - Stop
(0000488)
Fred   
2011-10-27 07:22   
Hmmm, I may be off track here, but that seems wrong to me.

If you flush after you send the init sequence, you risk flushing the reply!

Also, what good does listening three times do? Unless you're listening too soon, in which case a very short delay in the thread doing the listening (it is threaded, right? :-) ) should make sure the reply is ready to RX when you "listen".

Don't change the button to "connect again"; that'd be poor style. Just output "connect failed, please try again" in the text output area or a pop-up, preferably just the text area.
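
The flush-ordering concern in this note can be illustrated with a toy model. This is a hedged Python sketch: `FakePort` is a made-up stand-in whose "device" replies the instant it is written to, and the `b"init"` and `b"ack"` payloads are placeholders, not the real init sequence or reply.

```python
class FakePort:
    """Toy serial port: the simulated device answers as soon as it is written to."""

    def __init__(self) -> None:
        self.rx = bytearray(b"stale")  # leftover bytes from before the connect

    def write(self, data: bytes) -> None:
        # Assumption for this demo only: the reply is already in the receive
        # buffer by the time write() returns.
        self.rx += b"ack"

    def flush_input(self) -> None:
        self.rx.clear()

    def read_all(self) -> bytes:
        data = bytes(self.rx)
        self.rx.clear()
        return data


def flush_then_send(port: FakePort) -> bytes:
    port.flush_input()   # drop stale bytes first
    port.write(b"init")
    return port.read_all()


def send_then_flush(port: FakePort) -> bytes:
    port.write(b"init")
    port.flush_input()   # this can throw away the reply that just arrived
    return port.read_all()


if __name__ == "__main__":
    print(flush_then_send(FakePort()))
    print(send_then_flush(FakePort()))
```

On real hardware the reply takes some time to arrive, so flushing after the send only loses it some of the time, which would fit an intermittent failure.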
(0000492)
sean94z   
2011-10-27 14:27   
Sorry I left something out.

 - Send serial init
 - Flush
 - Send SM sequence
 - Listen for reply with short timeout 3x
 - If receive reply, set connected state and mark load button
 - I could change the text to "try to connect again"
 - Stop

Dave turned me onto *selecting instead of *reading from the serial port. A select will wait until a char comes in, while a *read may return nothing. All *selects are tried three times before the serialCode kicks back a read error.
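
For reference, the select-with-retry behaviour described here might look something like the Python sketch below. It is not the serialCode implementation: `read_byte_with_select`, the 50 ms timeout, and the use of a pipe in place of the serial port are all assumptions made so the example runs on a Unix-like system without hardware.

```python
import os
import select
from typing import Optional


def read_byte_with_select(fd: int, timeout_s: float = 0.05, tries: int = 3) -> Optional[bytes]:
    """Wait for one byte on `fd`, repeating the select only when it times out.

    Returns the byte read, or None after `tries` timeouts so the caller can
    report a read error.
    """
    for _attempt in range(tries):
        readable, _, _ = select.select([fd], [], [], timeout_s)
        if readable:
            return os.read(fd, 1)
    return None


if __name__ == "__main__":
    r, w = os.pipe()  # the pipe stands in for the serial port
    print(read_byte_with_select(r))   # nothing written yet, so this times out
    os.write(w, b"\xE0")
    print(read_byte_with_select(r))   # data is waiting, so select returns at once
    os.close(r)
    os.close(w)
```

With these made-up numbers the worst case before an error is reported is `tries * timeout_s`, which is the figure that matters for how long a failed connect appears to hang.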
(0000493)
Fred   
2011-10-27 14:39   
So by "send serial init" you don't mean that at all. You mean init the serial port? And by Send SM Sequence you mean send the open connection stuff to the SM?

So the logic on the selects is which:

A do select, if timeout do again, etc
B do select, if wrong data OR timeout do again, etc.

If B that's got to be wrong... but I suspect your wording was vague, not the code wrong, right?

Actually, reading again, "serialCode" is your built-in lib? And you do three selects on every failed read? With no ability to override the timeout? Hmmm. The timeout for a general-purpose (GP) read would want to be very short so that it returned quickly, but the timeout on a specific speed port and setup would want to be tuned to the requirements of that setup such that you're sure it should have arrived by now and can confidently issue an error. Hmmm, I'll leave that stuff in your hands :-)

However, what IS clear is that you're only sending the SM sequence once, and only listening for the reply once (with the low-level retry), and it would be good if your high-level stuff could also do retries at the sending level too.
(0000553)
Fred   
2011-11-11 18:26   
On the mac it does one fail to connect and then succeeds most times. Just FYI. I think this logic, whatever it is, needs a rework.
(0000554)
sean94z   
2011-11-11 19:26   
I think it just needs a couple sleep()s.
(0000557)
Fred   
2011-11-11 20:05   
How can a sleep call possibly do anything when it's locked up and already in a coma?
(0000558)
sean94z   
2011-11-11 20:16   
hmmm maybe I need a couple sleep()s LOL
(0002271)
Fred   
2012-10-06 12:51   
LOL @ last comment. Rather than making a new issue:

(14:10:47) Fred: first connect blocks for a second or so
(14:10:52) Fred: second one works instantly

This is DEFINITELY solvable. I know because I've done it:

fred@cheetah:~/workspaces/eclipse/serial-monitor$ time java -jar target/serial-monitor-0.0.1-SNAPSHOT-bin.jar /dev/ttyUSB0
Opening serial device: /dev/ttyUSB0
Wrote 0x0DGoing to sleep for 6
Over slept by 4
 Got 3 bytes! Successful open: 3 {0xE0, 0x08, 0x3E, }

real 0m0.295s
user 0m0.228s
sys 0m0.040s

295 milliseconds includes JVM startup, class loading, etc, etc, etc...

The truth of the matter is that after just 2 milliseconds you can have your answer. Something is wrong with the way you're handling serial reads IMO.
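
The exchange in the log above (write 0x0D, sleep briefly, read three bytes) can be sketched in Python as below. This is not the Java tool's source: `open_serial_monitor` and `FakeMonitor` are hypothetical names, the 6 in the log is assumed to be milliseconds, and the port is any object offering `write(data)` and `read(size)`.

```python
import time

# The three bytes reported as a successful open in the log above.
EXPECTED_REPLY = bytes([0xE0, 0x08, 0x3E])


def open_serial_monitor(port, settle_s: float = 0.006) -> bool:
    """Send a carriage return, wait briefly, and check the 3-byte reply."""
    port.write(b"\x0D")
    time.sleep(settle_s)  # give the reply time to arrive before reading
    reply = port.read(len(EXPECTED_REPLY))
    return reply == EXPECTED_REPLY


class FakeMonitor:
    """Made-up stand-in for the device so the sketch runs without hardware."""

    def __init__(self, alive: bool = True) -> None:
        self.alive = alive
        self._pending = b""

    def write(self, data: bytes) -> None:
        if self.alive and data == b"\x0D":
            self._pending = EXPECTED_REPLY

    def read(self, size: int) -> bytes:
        out, self._pending = self._pending[:size], self._pending[size:]
        return out


if __name__ == "__main__":
    print(open_serial_monitor(FakeMonitor()))
    print(open_serial_monitor(FakeMonitor(alive=False)))
```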
(0002274)
Fred   
2012-10-06 20:31   
(21:54:15) Fred: Error: Data read, but it was not a serial monitor ACK
(21:54:15) Fred: Info: Unable to summon SerialMonitor
(21:54:15) Fred: how many retries?
(21:54:34) Fred: Info: closing serial port
(21:54:38) Fred: on second connect...
(21:54:58) Fred: serial monitor already running
(21:55:05) Fred: so the first one DID succeed...
(21:55:28) Fred: flush your buffer(s) before sending/reading?
(21:56:53) Fred: close/reset should return instantly.
(21:57:00) Fred: and always
(21:57:10) Fred: it virtually can't fail to send one byte and close the device
(21:58:10) Fred: holy shit
(21:58:21) Fred: i close the WINDOW and it takes a full second to terminate
(21:58:23) Fred: why
(21:58:25) Fred: not cool
(0002276)
sean94z   
2012-10-06 21:44   
(21:54:15) Fred: how many retries? 0

(21:55:28) Fred: flush your buffer(s) before sending/reading? It's flushed after the port has been opened.

(21:56:53) Fred: close/reset should return instantly. The read() block has to expire before the threads can read the terminate request. The block has been reduced greatly in the serialLib.

I was unable to repeat this error on my recent work, but let me see if I can figure out what's going on via the clues you gave me.
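
One way to keep close/terminate quick, given that the blocking read has to expire before the thread sees the terminate request, is to block only for a short, fixed interval and check a stop flag in between. The Python sketch below illustrates that pattern only; `Reader`, the 10 ms poll interval, and the pipe standing in for the serial port are assumptions, not the serialLib design.

```python
import os
import select
import threading
import time


class Reader(threading.Thread):
    """Reads from `fd` in short blocking slices so a stop request is seen quickly."""

    def __init__(self, fd: int, poll_s: float = 0.01) -> None:
        super().__init__(daemon=True)
        self.fd = fd
        self.poll_s = poll_s
        self.stop_requested = threading.Event()
        self.received = bytearray()

    def run(self) -> None:
        while not self.stop_requested.is_set():
            # Block for at most poll_s, then re-check the stop flag.
            readable, _, _ = select.select([self.fd], [], [], self.poll_s)
            if readable:
                self.received += os.read(self.fd, 64)

    def stop(self) -> float:
        """Request termination and return how long the thread took to exit."""
        start = time.monotonic()
        self.stop_requested.set()
        self.join()
        return time.monotonic() - start


if __name__ == "__main__":
    r, w = os.pipe()  # the pipe stands in for the serial port
    reader = Reader(r)
    reader.start()
    os.write(w, b"\x01")
    time.sleep(0.1)
    print(bytes(reader.received), reader.stop())
```

The time to terminate is bounded by roughly one poll interval rather than by the longest read timeout in use.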
(0002278)
Fred   
2012-10-07 00:18   
If blocking on a specific quantity of data, it should be for a specific time.

If blocking on single byte reads into a buffer, it should be for a short time.

I also just had a rip fail in the middle. That's pretty disappointing, as there is no real reason not to keep trying quite hard to complete.
(0002279)
sean94z   
2012-10-07 00:21   
"If blocking on a specific quantity of data, it should be for a specific time." Maybe, but don’t some SM operations take longer to complete than others???

right, read operations are easy to *retry
(0002280)
Fred   
2012-10-07 00:22   
Yes, but you know up front, even if only empirically, how long that is.
(0002281)
sean94z   
2012-10-07 00:28   
I think the generic time-out is currently 2 seconds.
(0002282)
Fred   
2012-10-07 00:35   
Seems like a lifetime.

For a single byte I wait just 2ms. The worst case is erase all, which takes under 2.6 seconds.

Everything else is fast.

Blocking is just another way of sleeping, really.
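
The per-operation timeouts discussed in this exchange could be captured in a small table instead of one generic value. A minimal Python sketch follows; the two figures are the ones quoted in this note, while the operation names and the `timeout_for` helper are hypothetical.

```python
# Timeouts in seconds, keyed by a hypothetical operation name.
TIMEOUTS_S = {
    "single_byte": 0.002,  # "For a single byte I wait just 2ms"
    "erase_all": 2.6,      # worst case quoted above: "under 2.6 seconds"
}


def timeout_for(operation: str) -> float:
    """Return the timeout for a known operation.

    Unknown operations raise KeyError on purpose, so that nothing silently
    falls back to a long generic timeout.
    """
    return TIMEOUTS_S[operation]


if __name__ == "__main__":
    for name in TIMEOUTS_S:
        print(name, timeout_for(name))
```

Any other operation would need its own measured entry before it could be used.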
(0002283)
sean94z   
2012-10-07 00:37   
Yeah, overkill for all but erase all. It was just a quick & dirty way to get it done in a hurry.
(0002284)
sean94z   
2012-10-07 00:43   
quick/dirty/temporary :-p
(0002475)
sean94z   
2012-12-13 16:20   
The connect button has been removed altogether; this issue can no longer exist.
(0002476)
Fred   
2012-12-13 16:23   
Presumably it still connects, why can't it fail the same way?
(0002535)
Fred   
2013-01-18 23:04   
Still does the first connect fails thing.
(0002536)
sean94z   
2013-01-18 23:07   
I can't reproduce it... Let me try a couple of other RS-232 devices.