[Java] Set deserialized as ArrayList in XLANG mode disrupts collection semantics #2105

Open
2 tasks done
LouisLou2 opened this issue Mar 16, 2025 · 0 comments
Labels
bug Something isn't working

Comments

@LouisLou2
Contributor

Search before asking

  • I searched the issues and found no similar issues.

Version

0.10.0

Component(s)

Java

Minimal reproduce step

package org.apache.fury;

import org.apache.fury.config.Language;

import java.util.HashSet;
import java.util.Set;

public class Main {
  public static class SomeClass {
    Set<String> set = new HashSet<>();
  }

  public static void main(String[] args) {
    Fury fury = Fury.builder()
            .withLanguage(Language.XLANG)
            .withRefTracking(true)
            .ignoreBasicTypesRef(true)
            .ignoreTimeRef(true)
            .ignoreStringRef(false)
            .build();
    fury.register(SomeClass.class, "MySomeClass");

    SomeClass someClass = new SomeClass();
    byte[] bytes = fury.serialize(someClass);
    SomeClass obj = (SomeClass)fury.deserialize(bytes);
    System.out.println(obj.set.getClass());
  }
}

What did you expect to see?

I expected the deserialized object to maintain the same collection type as the original object - a HashSet in this case. Since the field is declared as a Set, it should be deserialized as a Set implementation like HashSet, preserving the collection's semantics (uniqueness of elements, hash-based operations, etc.).

What did you see instead?

The set field of the deserialized object holds an ArrayList instead of a HashSet:

class java.util.ArrayList

The Set is converted to an ArrayList during deserialization, which breaks the expected behavior of the Set interface (no duplicate elements, different performance characteristics for operations like contains()).
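To make the semantic difference concrete, here is a minimal sketch that uses only the JDK (no Fury involved). The class name CollectionSemanticsDemo and the helper sizeAfterAdding are made up for illustration. It feeds the same elements into a HashSet and an ArrayList and compares the results:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

public class CollectionSemanticsDemo {
  // Adds all elements to the given collection and returns its resulting size.
  static int sizeAfterAdding(Collection<String> target, List<String> elements) {
    target.addAll(elements);
    return target.size();
  }

  public static void main(String[] args) {
    List<String> input = Arrays.asList("a", "b", "a");

    // A HashSet drops the duplicate "a"; an ArrayList keeps it.
    System.out.println(sizeAfterAdding(new HashSet<>(), input));   // prints 2
    System.out.println(sizeAfterAdding(new ArrayList<>(), input)); // prints 3
  }
}
```

Code that later adds elements to the deserialized field and expects deduplication would see the second behavior instead of the first.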

Anything Else?

I've traced the issue to the StructSerializer::getGenericType method which contains the following logic:

if (resolver.isMap(cls)) {
  t.setSerializer(
      ReflectionUtils.isAbstract(cls)
          ? new MapSerializer(fury, HashMap.class)
          : resolver.getSerializer(cls));
} else if (resolver.isCollection(cls)) {
  t.setSerializer(
      ReflectionUtils.isAbstract(cls)
          ? new CollectionSerializer(fury, ArrayList.class)
          : resolver.getSerializer(cls));
} else if (cls.isArray()) {
  t.setSerializer(new ArraySerializers.ObjectArraySerializer(fury, cls));
}

When the declared type of a field is a Collection and that type is abstract (which includes interfaces such as Set), the serializer falls back to ArrayList as the implementation instead of preserving the original collection type.
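The reason an interface like Set takes the ArrayList branch can be checked with plain reflection. The sketch below assumes that ReflectionUtils.isAbstract behaves like the JDK's Modifier.isAbstract applied to the class modifiers; I have not verified that against the Fury source, and the class name AbstractCheckDemo is only for illustration:

```java
import java.lang.reflect.Modifier;
import java.util.AbstractSet;
import java.util.HashSet;
import java.util.Set;

public class AbstractCheckDemo {
  // Assumed stand-in for ReflectionUtils.isAbstract: checks the ABSTRACT modifier bit.
  static boolean isAbstract(Class<?> cls) {
    return Modifier.isAbstract(cls.getModifiers());
  }

  public static void main(String[] args) {
    // Interfaces carry the ABSTRACT modifier, so a Set-typed field counts as abstract.
    System.out.println(isAbstract(Set.class));         // prints true
    System.out.println(isAbstract(AbstractSet.class)); // prints true
    System.out.println(isAbstract(HashSet.class));     // prints false
  }
}
```

Under that assumption, only a field declared with a concrete type such as HashSet would reach resolver.getSerializer(cls).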

While this approach makes the code work (the elements are preserved), it changes the semantics of the collection. This can lead to unexpected behavior for code that relies on specific collection implementations or interfaces.
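One possible direction, sketched here with JDK types only, is to pick the fallback implementation from the declared type instead of always using ArrayList. The helper defaultImplementation and its candidate list are hypothetical and not part of Fury; how the chosen class would be wired into CollectionSerializer is left open:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class DefaultCollectionTypes {
  // Illustrative candidates, tried in order. ArrayList comes first so that
  // Collection and List keep today's behavior.
  private static final Class<?>[] CANDIDATES = {
    ArrayList.class, HashSet.class, TreeSet.class
  };

  // Returns the first candidate that can be assigned to the declared type.
  static Class<?> defaultImplementation(Class<?> declared) {
    for (Class<?> candidate : CANDIDATES) {
      if (declared.isAssignableFrom(candidate)) {
        return candidate;
      }
    }
    throw new IllegalArgumentException("No default implementation for " + declared);
  }

  public static void main(String[] args) {
    System.out.println(defaultImplementation(Collection.class)); // class java.util.ArrayList
    System.out.println(defaultImplementation(List.class));       // class java.util.ArrayList
    System.out.println(defaultImplementation(Set.class));        // class java.util.HashSet
    System.out.println(defaultImplementation(SortedSet.class));  // class java.util.TreeSet
  }
}
```

The isAssignableFrom check guarantees that whatever is returned is a legal value for the declared field type, which is the property the current ArrayList fallback violates for Set fields.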

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@LouisLou2 LouisLou2 added the bug Something isn't working label Mar 16, 2025
@LouisLou2 LouisLou2 changed the title Set deserialized as ArrayList in XLANG mode disrupts collection semantics [Java] Set deserialized as ArrayList in XLANG mode disrupts collection semantics Mar 18, 2025