Froze at "Sending data via Ethernet" #39
Here's the code from the Arduino. There doesn't seem to be an "else: write to SD and continue" branch. Design-wise, it seems like it would be more efficient to let the ODIn queue data to send to the server when the connection is down. Lines 612 to 624 in 85af6f1
|
There is no else branch on this if statement because the boolean variable being tested is hard-coded to true at the top of this source file. I'll take a look through the Ethernet library that was used to see if we might have missed anything when we wrote the software. |
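As a rough illustration of the "else: write to SD and continue" idea raised above, the sketch below checks each step of a UDP send and hands the payload to a fallback when any step fails. It is not the ODIn source: `sendOrStore`, `SendResult`, the `Udp`/`Addr`/`Store` template parameters, and `store.save()` are hypothetical names. Only the `beginPacket()`, `write()`, and `endPacket()` calls mirror the Arduino `EthernetUDP` interface, where `beginPacket()` and `endPacket()` return 1 on success. The template parameters are there so the logic can also be exercised off the device with stand-in objects.

```cpp
#include <stddef.h>
#include <stdint.h>

// Outcome of one attempt, so the caller can log what happened.
enum SendResult { SENT, STORED, LOST };

// Udp:   any type with EthernetUDP-style beginPacket/write/endPacket.
// Addr:  the address type (IPAddress on Arduino).
// Store: hypothetical fallback with bool save(const uint8_t*, size_t),
//        for example a thin wrapper around a file on the SD card.
template <typename Udp, typename Addr, typename Store>
SendResult sendOrStore(Udp &udp, const Addr &server, uint16_t port,
                       const uint8_t *data, size_t len, Store &store) {
  bool ok = udp.beginPacket(server, port) == 1;  // 1 means success
  if (ok) {
    // Only write once the packet has actually been started.
    ok = udp.write(data, len) == len;
    // Close the packet either way; 1 means it was sent.
    ok = (udp.endPacket() == 1) && ok;
  }
  if (ok) {
    return SENT;
  }
  // The "else" branch: keep the reading locally and carry on sampling.
  return store.save(data, len) ? STORED : LOST;
}
```

With this shape the sampling loop never depends on the network being up: a failed send costs one local write, and the stored readings could be replayed to the server later.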
Those syslog entries, e.g. the avahi ones, don't seem to be related. I don't think that is the IP of minun, right? One other thing I'd recommend is checking whether the Arduino libraries for the Ethernet hardware have been updated since the Arduino software was last built. I wonder if the Ethernet send failed and the device froze. It may also be possible to call begin/endPacket in a way that is non-blocking. |
@dacb No, the IP is different, but these files came off of the muninn. Any idea why that is? It's on a static IP that the department assigned. As for the Ethernet hardware and libraries: the ODIn is still using the old Ethernet shield (as in, it hasn't been replaced since I started using it), but I did purchase a replacement shield. I just haven't installed it yet. It's an Arduino Ethernet Shield 2. |
I think those |
The Ethernet library that we use comes, I believe, with the Arduino IDE package of standard libraries, for which reference documentation is published. I looked through the release notes for versions newer than the 1.0.6 release that we have been using, and nothing relevant to our situation stood out.
What I suspect happened is that the ODIn software failed on line 621, where the data is written into the packet to send to the server. The Ethernet library reference documentation for the write() function states that it must "be wrapped between beginPacket() and endPacket()." In the source code it is indeed placed between calls to those two functions. However, I suspect that an uncaught error from the beginPacket() call sets up undefined behavior when write() is then called, causing the ODIn system to freeze.
A follow-up question, then, is what caused the call to beginPacket() to fail. In the Ethernet library reference documentation I noticed a maintain() function, which we do not currently use in our source code. Its documentation warns that this function needs to be called on a sub-second interval in order to successfully renegotiate the DHCP lease before the network router leases the IP address to someone else. Internally the library keeps track of when the lease expires, so calls to maintain() do nothing unless it is time for DHCP renegotiation. Unfortunately, the data sampling section of the loop() function contains multiple delays, which can leave a gap long enough for bad network behavior to occur. |
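The point about long delays can be made concrete with a small helper that waits in short slices and calls `maintain()` between them, so the DHCP lease check is not starved while the device is idle. This is a sketch under assumptions rather than the ODIn code: `waitWithMaintain`, the `Eth` and `Clock` template parameters, the `now()`/`sleep()` methods, and the 100 ms default slice are all hypothetical. On an Arduino, `eth` would be the global `Ethernet` object and the clock would wrap `millis()` and `delay()`. The return codes follow the Arduino documentation for `Ethernet.maintain()`, where 1 is a failed renew and 3 is a failed rebind.

```cpp
#include <stdint.h>

// Eth:   any type with an Ethernet-style int maintain().
// Clock: hypothetical wrapper with uint32_t now() and void sleep(uint32_t),
//        backed by millis() and delay() on Arduino.
// Waits for totalMs while calling maintain() about every sliceMs
// (sliceMs must be greater than zero).
// Returns how many maintain() calls reported a failed renew (1)
// or a failed rebind (3).
template <typename Eth, typename Clock>
int waitWithMaintain(Eth &eth, Clock &clk, uint32_t totalMs,
                     uint32_t sliceMs = 100) {
  int failures = 0;
  const uint32_t start = clk.now();
  for (;;) {
    int status = eth.maintain();  // 0 means nothing needed doing
    if (status == 1 || status == 3) {
      ++failures;
    }
    // Unsigned subtraction stays correct when the counter wraps.
    uint32_t elapsed = clk.now() - start;
    if (elapsed >= totalMs) {
      break;
    }
    uint32_t remaining = totalMs - elapsed;
    clk.sleep(remaining < sliceMs ? remaining : sliceMs);
  }
  return failures;
}
```

Replacing each long delay in the sampling section with a call like this would keep the total wait the same while bounding the gap between `maintain()` calls to roughly one slice.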
Pull request #41 contains changes that should help make the Ethernet connection more resilient. |
Thanks @jsebrof. I'm running your new code right now. Is there a certain length of time I should run it in order to confirm it works, or any other thing I could do to test the changes? |
It should be working pretty great if you are receiving information from the
ODIn on the server.
To truly test all of the new error handling, you'd need to log
into the network router, cancel the DHCP lease that the ODIn is using, and
then connect new network devices until the IP address that the ODIn had
been using is given to one of them.
This would force the ODIn to renegotiate its DHCP lease in order to
continue using the Ethernet connection.
--
Jonathan Forbes
|
@jsebrof the server is on a static IP address and the Arduino (I believe) is not, although it uses an Ethernet connection. Would you happen to know how this affects the DHCP lease? On the Raspberry Pi server I entered the following:
I'm not sure if that accomplished what you suggested, but the data is still coming in from the device without issue. I'm also curious: if I assigned a static IP address to the ODIn as well, would this circumvent the issue? |
Yes, the server has a static IP address, and the ODIn is programmed to send
data to that IP address specifically.
I don't think there is anything that you can do on the Raspberry Pi server
to test this networking issue.
What you would need to do is access the network router, which is an
entirely separate device itself, and then place the ODIn into a position
where it would need to reconfigure its Ethernet configuration.
An alternative to this would be watching the server logs and observing whether
the ODIn ever changes IP addresses during the course of a single run.
This method, however, could take a while, as you mentioned that the networking
problem only occurred rarely.
The server needs to have a static IP address, because the ODIn needs to
know where to send the data, and wouldn't know if we didn't hard-code a
network location for it to send the data to.
The ODIn, on the other hand, doesn't *need* to have a static IP address,
because all it does is send data to the server, and the server doesn't
particularly care where the data came from.
We could give the ODIn a static IP address, and that would circumvent this
DHCP lease issue.
This is also something that you would need access to the network router in
order to accomplish.
Let me know if you want to do this, and I'll change the Arduino code.
The modifications that were just made should solve this issue though,
provided the cause really was DHCP lease expiration.
|
I asked the IT department if they could give me a new static IP for the ODIn. Even if the code fixes it, I think it would be good to have a failsafe. If it's easy you can tell me where this info would go and I'll edit the code when I hear back from IT. Otherwise I'll forward the email to you. Thanks! |
As requested, pull request #42 contains modifications to have the ODIn use a static IP address instead of DHCP. |
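For readers wondering where the static address would go, the general shape is sketched below. This is not the code from pull request #42, which is not shown in this thread: `startStatic` and its template parameters are hypothetical. The calls it relies on, `begin(mac, ip)` and `localIP()`, mirror the Arduino `Ethernet` object, whose two-argument `begin()` configures a fixed address without contacting a DHCP server.

```cpp
#include <stdint.h>

// Eth:  any type with Ethernet-style begin(mac, ip) and localIP().
// Addr: the address type (IPAddress on Arduino).
// Configures a fixed address and reports whether the hardware
// reads the same address back afterwards.
template <typename Eth, typename Addr>
bool startStatic(Eth &eth, uint8_t *mac, const Addr &ip) {
  eth.begin(mac, ip);          // static form: no DHCP lease involved
  return eth.localIP() == ip;  // sanity check that the address took
}
```

On the device this would be called once from `setup()`, for example `startStatic(Ethernet, mac, IPAddress(192, 0, 2, 50))`, where `mac` is the shield's MAC address array and the address shown is only a placeholder for whatever IT assigns. Because no lease exists in this mode, there is nothing for `maintain()` to renew.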
On 5/20 I came in and saw that the ODIn was frozen (the platform was up to take a reading and was not shaking). The last reading recorded by the server was on 5/19/19 at 15:28. Until that point, readings were looking as expected.
Here are the most recent lines from run.log, located in /var/www/ODIn_dataServer/
Here is some of the data from the system log, located at /var/log/syslog
And I noted that there is a syslog.1 file. Here is some of the data from that:
On 5/20 I shut the ODIn down and started a completely new run. So far I have not encountered this error. It seems to me that the main issue is that the ODIn isn't skipping sending data to the server when it can't make the connection, and instead freezes.