-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
disconnection with error code [5154] and reason code [SESSION_TAKEN_OVER] #580
Comments
I enabled CRT tracing and native debugging by adding these VM options: When just running my client and waiting for a couple of minutes, I keep getting the SESSION_TAKEN_OVER:
Next time:
This goes on for a while until I stop my app. Some suspicous things that I see from the logs:
And after waiting for the 1 minute timeout:
Hope this helps, I can also provide full CRT logs if that would help. |
Another suspicious line from the same logs:
|
The SESSION_TAKEN_OVER error code is sent in a disconnect packet from the server when the server gets a connect packet containing the same client id as a client that is currently connected to it. The server will accept the connection from the new client and disconnect the previously connected client. You probably have two clients running concurrently using the same client id and they are taking turns disconnecting each other. To test whether this is the issue, try changing your createClient() function to use a UUID to generate a client id instead of using thing name at You can also setup and check Cloudwatch logs to see the incoming CONNECT packets to your MQTT endpoint. If you see more than one CONNECT packet with two different UUID client ids, that will also confirm that something is creating multiple clients and starting them up. |
Hi Kim, |
Hello Kim, W/o any changes to the thing name after starting my client, here are the events as shown in IoTCoreV2 logs on CloudWatch. I get 3 events: Connect, Subscribe, Disconnect in the first log stream: { { { then the next log stream shows a connect event: { and another log stream contains a disconnect: { |
When I stop my client i get client logs: 17:02:55.585 [AwsEventLoop 12] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - disconnection with reason code [SESSION_TAKEN_OVER] and sessionExpiryIntervalSeconds [null]: null Process finished with exit code 130 (interrupted by signal 2:SIGINT) the event that matches the shutdown in CloudWatch: { and that is the last IoT Event i get there with the next minutes, so it seems this was indeed the only client running |
as expected, connect fails when using some random UUID: connection failure with error code: [5150]: assignedClientIdentifier [null], reasonCode [NOT_AUTHORIZED], reasonString [CONNACK:Client is not authenticated/authorized to send the message:0af2347e-d534-cb34-502c-af1bac3ebdb4] |
I also now added thread synchronization using a CountDownLatch so I make sure the subsribe is only done after the successful connect. The issue persists, however. |
Hello @kriehzoo , |
I feel like we are not going in the right direction in diagnosing this. What we need are SDK logs at the Debug level. You can find documentation for doing so here: https://awslabs.github.io/aws-crt-java/software/amazon/awssdk/crt/Log.html I will also echo what's been said previously: this behavior only occurs if multiple clients are fighting over the same client id. Beyond logs, a complete working sample that repros the issue would also be helpful. |
Hi @bretambrose with SDK logs at Debug level you mean CRT Logging? I enabled that before to get trace output and provided the relevant part. |
@alfred2g I will try that and let you know |
@alfred2g I added a second IoT policy that allows any client using the same certificate and I changed to use a UUID as client ID. |
Please include full debug logs. Snippets aren't of much use. |
@alfred2g I also removed the second IoT policy again and still I do not get the |
What about the |
@bretambrose next time it happens I will send you full debug logs. Thanks for your support, also to @alfred2g |
So this only happens with low frequency? |
@bretambrose it happened every time when I started the client and now it seems to be gone and to be honest, I am not sure, why |
If you could give me a quick indication on waitForNoResources() I could probably resolve this issue. |
Seems very likely that a second process -- either on your machine or on a colleague's machine -- was using the same client id and account. You don't need to bother with waitForNoResources. That's a public-internal API we used to verify native resources aren't leaking within tests/samples. |
@bretambrose thank you for the quick response, so I will remove the call to |
@bretambrose issue cannot be reproduced so I will close it |
This issue is now closed. Comments on closed issues are hard for our team to see. |
Describe the bug
I have a subsribing client written in Java and Spring Boot.
When I start the client I keep seeing
SESSION_TAKEN_OVER
.Already opened a case with AWS support but I was referred to this repo to open an issue here.
I know that
SESSION_TAKEN_OVER
will happen when I use the same client ID for different clients but I only start one client so this cannot be the issue.My suspicion is that resources may be leaking from a previous run?
Do I need some cleanup when shutting down my client? I came across
CrtResource.waitForNoResources()
so is this required?How can I detect if resources are leaking?
I use my IoT thing name as client ID.
Building my client looks like this:
for stop:
creating the client:
starting it:
subscribing:
shutdown:
startup:
So I think these are all relevant code snippets.
Expected Behavior
to not see SESSION_TAKEN_OVER at all
Current Behavior
15:45:33.113 [main] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - onApplicationEvent: org.springframework.context.event.ContextRefreshedEvent[source=org.springframework.boot.web.servlet.context.AnnotationConfigServletWebServerApplicationContext@305f031, started on Tue May 07 15:45:31 CEST 2024]
15:45:33.500 [main] INFO c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - Start
15:45:33.500 [main] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - subscribing to topic [connector/connector-608/event] with QOS [AT_LEAST_ONCE]
15:45:33.500 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - attempting connect
15:45:33.889 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - connection success with clientID [connector-608] and rejoinedSession [true] and sessionExpiryIntervalSeconds [3600] and serverKeepAliveSeconds [60] : assignedClientIdentifier [null], reasonCode [SUCCESS], reasonString [null], responseInformation [null], sessionExpiryIntervalSeconds [3600], sessionPresent [true], serverKeepAliveSeconds [60], userProperties []
15:45:33.960 [main] INFO c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - subscribed to topic [connector/connector-608/event] with reason codes [GRANTED_QOS_1]: null
15:45:33.965 [main] INFO c.m.c.s.ConnectorClientApplication - Started ConnectorClientApplication in 2.44 seconds (process running for 2.796)
15:45:34.167 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - disconnection with error code [5154]
15:45:34.167 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - disconnection with reason code [SESSION_TAKEN_OVER] and sessionExpiryIntervalSeconds [null]: null
15:47:01.925 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - attempting connect
15:47:02.426 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - connection success with clientID [connector-608] and rejoinedSession [true] and sessionExpiryIntervalSeconds [3600] and serverKeepAliveSeconds [60] : assignedClientIdentifier [null], reasonCode [SUCCESS], reasonString [null], responseInformation [null], sessionExpiryIntervalSeconds [3600], sessionPresent [true], serverKeepAliveSeconds [60], userProperties []
15:47:03.127 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - disconnection with error code [5154]
15:47:03.127 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - disconnection with reason code [SESSION_TAKEN_OVER] and sessionExpiryIntervalSeconds [null]: null
15:47:29.742 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - attempting connect
15:47:30.053 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - connection success with clientID [connector-608] and rejoinedSession [true] and sessionExpiryIntervalSeconds [3600] and serverKeepAliveSeconds [60] : assignedClientIdentifier [null], reasonCode [SUCCESS], reasonString [null], responseInformation [null], sessionExpiryIntervalSeconds [3600], sessionPresent [true], serverKeepAliveSeconds [60], userProperties []
15:47:31.157 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - disconnection with error code [5154]
15:47:31.157 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - disconnection with reason code [SESSION_TAKEN_OVER] and sessionExpiryIntervalSeconds [null]: null
15:49:31.158 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - attempting connect
15:49:31.471 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - connection success with clientID [connector-608] and rejoinedSession [true] and sessionExpiryIntervalSeconds [3600] and serverKeepAliveSeconds [60] : assignedClientIdentifier [null], reasonCode [SUCCESS], reasonString [null], responseInformation [null], sessionExpiryIntervalSeconds [3600], sessionPresent [true], serverKeepAliveSeconds [60], userProperties []
15:49:32.455 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - disconnection with error code [5154]
15:49:32.455 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - disconnection with reason code [SESSION_TAKEN_OVER] and sessionExpiryIntervalSeconds [null]: null
15:50:55.170 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - attempting connect
15:50:55.534 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - connection success with clientID [connector-608] and rejoinedSession [true] and sessionExpiryIntervalSeconds [3600] and serverKeepAliveSeconds [60] : assignedClientIdentifier [null], reasonCode [SUCCESS], reasonString [null], responseInformation [null], sessionExpiryIntervalSeconds [3600], sessionPresent [true], serverKeepAliveSeconds [60], userProperties []
15:50:56.696 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - disconnection with error code [5154]
15:50:56.696 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - disconnection with reason code [SESSION_TAKEN_OVER] and sessionExpiryIntervalSeconds [null]: null
15:51:08.092 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - attempting connect
15:51:08.392 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - connection success with clientID [connector-608] and rejoinedSession [true] and sessionExpiryIntervalSeconds [3600] and serverKeepAliveSeconds [60] : assignedClientIdentifier [null], reasonCode [SUCCESS], reasonString [null], responseInformation [null], sessionExpiryIntervalSeconds [3600], sessionPresent [true], serverKeepAliveSeconds [60], userProperties []
15:51:08.939 [AwsEventLoop 2] DEBUG c.m.c.s.i.m.IoTCoreMqtt5ConnectionHandler - disconnection with error code [5154]
Reproduction Steps
Relevant source files were provided to AWS support. This issue will also be linked there.
Possible Solution
I suspect a resource leak.
Additional Information/Context
No response
SDK version used
1.20.1
Environment details (OS name and version, etc.)
Mac 14.3.1
The text was updated successfully, but these errors were encountered: