Java has a much-maligned (for good reason) serialization system built into the standard library. The output is a binary stream mapping the full object hierarchy and the relations between the objects.
The stream also includes definitions of classes and their hierarchies (superclasses etc.). The full specification is defined here.
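As a rough illustration of the stream format, every serialized stream starts with a two-byte magic number (`0xACED`) followed by a two-byte protocol version (`5`), as given by the `STREAM_MAGIC` and `STREAM_VERSION` constants in the Java specification. The sketch below checks for that header before any parsing is attempted. The function name `has_java_stream_header` is made up for this example and is not part of this crate.

```rust
/// First two bytes of every Java serialization stream (STREAM_MAGIC in the spec).
const STREAM_MAGIC: u16 = 0xACED;
/// Protocol version that follows the magic number (STREAM_VERSION in the spec).
const STREAM_VERSION: u16 = 5;

/// Hypothetical helper (not part of this crate): returns true if `bytes`
/// begins with a Java serialization stream header.
fn has_java_stream_header(bytes: &[u8]) -> bool {
    match bytes {
        // Both header fields are big-endian 16-bit values.
        [m0, m1, v0, v1, ..] => {
            u16::from_be_bytes([*m0, *m1]) == STREAM_MAGIC
                && u16::from_be_bytes([*v0, *v1]) == STREAM_VERSION
        }
        // Fewer than four bytes cannot hold a complete header.
        _ => false,
    }
}
```

A check like this is a cheap way to reject files that were never written by an `ObjectOutputStream` before handing them to a parser.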
In any new application there are probably better ways to serialize data with fewer security risks, but there are cases where a legacy application writes data out and we want to read it back in. If we want to read it from a separate application, it would be good if we weren't bound to Java.
I had one such application and rather than write more Java to interact with it, I wrote this.
```java
import java.io.FileOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class Demo implements Serializable {
    private static final long serialVersionUID = 1L;
    private String message;
    private int i;

    public Demo(String message, int count) {
        this.message = message;
        this.i = count;
    }

    public static void main(String[] args) throws Exception {
        Demo d = new Demo("helloWorld", 42);
        try (FileOutputStream fos = new FileOutputStream("demo.obj", false);
                ObjectOutputStream oos = new ObjectOutputStream(fos)) {
            oos.writeObject(d);
        }
    }
}
```
```rust
use std::fs::File;
use jaded::{Parser, Result};

fn main() -> Result<()> {
    let sample = File::open("demo.obj").expect("File missing");
    let mut parser = Parser::new(sample)?;
    println!("Read Object: {:#?}", parser.read()?);
    Ok(())
}
```
```text
Read Object: Object(
    Object(
        ObjectData {
            class: "Demo",
            fields: {
                "i": Primitive(
                    Int(
                        42,
                    ),
                ),
                "message": JavaString(
                    "helloWorld",
                ),
            },
            annotations: [],
        },
    ),
)
```
For most use cases, the raw types read from the stream are not very ergonomic to work with. For ease of use, types can implement `FromValue` and can then be read directly from the stream.
Taking the same Java `Demo` class defined and written above, the implementation could look something like
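```rust
struct Demo {
    message: String,
    i: i32,
}

impl FromValue for Demo {
    fn from_value(value: &Value) -> ConversionResult<Self> {
        // Build a `Demo` from the "message" and "i" fields of `value`
        // (conversion logic elided here).
        todo!()
    }
}
```
Demo objects can then be read directly from the stream
```rust
let d: Demo = parser.read_as()?;
```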
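To make the shape of such a conversion concrete, here is a self-contained sketch that does not depend on this crate at all. The `Value`, `ObjectData` and `ConversionError` types below are simplified, hypothetical stand-ins modelled loosely on the debug output shown earlier; the crate's real types and error variants may well differ. The point is only the pattern: match on the object variant, look up each field by name, and check that it has the expected type.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins shaped after the debug output above.
// These are NOT the crate's real definitions.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Int(i32),
    JavaString(String),
    Object(ObjectData),
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct ObjectData {
    class: String,
    fields: HashMap<String, Value>,
}

// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
enum ConversionError {
    NullValue,
    WrongType(&'static str),
    MissingField(&'static str),
}

#[derive(Debug, PartialEq)]
struct Demo {
    message: String,
    i: i32,
}

/// Converts a generic value into a `Demo`, checking every field on the way.
fn demo_from_value(value: &Value) -> Result<Demo, ConversionError> {
    // Only object values can become a Demo.
    let data = match value {
        Value::Object(data) => data,
        Value::Null => return Err(ConversionError::NullValue),
        _ => return Err(ConversionError::WrongType("Object")),
    };
    // Each field must be present and of the expected type.
    let message = match data.fields.get("message") {
        Some(Value::JavaString(s)) => s.clone(),
        Some(_) => return Err(ConversionError::WrongType("String")),
        None => return Err(ConversionError::MissingField("message")),
    };
    let i = match data.fields.get("i") {
        Some(Value::Int(n)) => *n,
        Some(_) => return Err(ConversionError::WrongType("int")),
        None => return Err(ConversionError::MissingField("i")),
    };
    Ok(Demo { message, i })
}

/// Builds the value a parser might produce for the Java `Demo` object above.
fn sample_value() -> Value {
    let mut fields = HashMap::new();
    fields.insert("i".to_string(), Value::Int(42));
    fields.insert(
        "message".to_string(),
        Value::JavaString("helloWorld".to_string()),
    );
    Value::Object(ObjectData {
        class: "Demo".to_string(),
        fields,
    })
}
```

Calling `demo_from_value(&sample_value())` yields a `Demo` with `message` set to `"helloWorld"` and `i` set to `42`, while a `Value::Null` input is reported as an error rather than a panic.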
Unfortunately, there are limits to what we can do without the original code that created the serialized byte stream. The protocol linked above lists four types of object. One of these, classes that implement `java.lang.Externalizable` and use `PROTOCOL_VERSION_1` (which has not been the default since v1.2), cannot be read by anything other than the class that wrote them, as their data is nothing more than a stream of bytes.
Of the remaining three types we can only reliably deserialize two.
1. 'Normal' classes that implement `java.lang.Serializable` without having a `writeObject` method. These can be read as shown above.
2. Classes that implement `Externalizable` and use the newer `PROTOCOL_VERSION_2`. These can be read, although their data is held entirely in the `annotations` field of the `ObjectData` struct and the `get_field` method only returns `None`.
3. Serializable classes that implement `writeObject`. These objects are more difficult. The spec above suggests that they have their fields written as 'normal' classes and then have optional annotations written afterwards. In practice this is not the case: the fields are only written if the class calls `defaultWriteObject` as the first call in its `writeObject` method. The spec states this as a requirement, so we can assume it holds for classes in the standard library, but it is something to be aware of if user classes are being deserialized.
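The four kinds of object described above can be told apart from the flags byte in each class descriptor. The sketch below uses the `SC_*` flag values from the Java serialization specification; the `ObjectKind` enum and the `classify` function are names invented for this example and are not part of this crate.

```rust
// Class descriptor flag bits, as defined in the Java serialization spec.
const SC_WRITE_METHOD: u8 = 0x01; // Serializable class defines writeObject
const SC_SERIALIZABLE: u8 = 0x02; // class implements Serializable
const SC_EXTERNALIZABLE: u8 = 0x04; // class implements Externalizable
const SC_BLOCK_DATA: u8 = 0x08; // Externalizable data written in block data mode

/// Hypothetical classification of the four object kinds discussed above.
#[derive(Debug, PartialEq)]
enum ObjectKind {
    /// Serializable without writeObject: fields can be read directly.
    PlainSerializable,
    /// Serializable with writeObject: fields may or may not be present.
    CustomWriteObject,
    /// Externalizable, PROTOCOL_VERSION_1: raw bytes with no framing.
    ExternalizableV1,
    /// Externalizable, PROTOCOL_VERSION_2: data framed as block data.
    ExternalizableV2,
    /// Neither Serializable nor Externalizable flag is set.
    Unknown,
}

/// Maps a class descriptor flags byte to the kind of object it describes.
fn classify(flags: u8) -> ObjectKind {
    let has = |bit: u8| flags & bit != 0;
    if has(SC_EXTERNALIZABLE) {
        if has(SC_BLOCK_DATA) {
            ObjectKind::ExternalizableV2
        } else {
            ObjectKind::ExternalizableV1
        }
    } else if has(SC_SERIALIZABLE) {
        if has(SC_WRITE_METHOD) {
            ObjectKind::CustomWriteObject
        } else {
            ObjectKind::PlainSerializable
        }
    } else {
        ObjectKind::Unknown
    }
}
```

A reader could use a check like this to decide up front whether an object's data can be parsed reliably or has to be treated as opaque.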
The consequence of this is that once we have found a class that we can't read, it is difficult to get back on track as it requires picking out the marker signifying the start of the next object from the sea of custom data.
In the future, there will hopefully be a way to define how customised classes should be read so that, at least within an application where the expected class types are known beforehand, all classes can be read.
It may also be possible to 'guess' how classes were written by making some assumptions and hoping that custom data doesn't look like stream markers. This method would be unreliable, though, and as such will only ever be an opt-in process.
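A small sketch of why guessing is unreliable: the marker that starts a new object in the stream is the single byte `0x73` (`TC_OBJECT` in the specification), which is also the ASCII code for the letter `s`. A naive scan for the next object therefore reports false positives in any custom data that contains text. The function name `candidate_object_offsets` is made up for this illustration and is not part of this crate.

```rust
/// Marker byte that introduces a new object in the stream (TC_OBJECT in the spec).
const TC_OBJECT: u8 = 0x73;

/// Hypothetical helper: returns every offset where a new object *might* start.
/// Any byte equal to TC_OBJECT is reported, whether or not it really is a marker.
fn candidate_object_offsets(bytes: &[u8]) -> Vec<usize> {
    bytes
        .iter()
        .enumerate()
        .filter(|&(_, &b)| b == TC_OBJECT)
        .map(|(offset, _)| offset)
        .collect()
}
```

For example, scanning the custom data `b"status"` followed by one genuine `0x73` marker reports three candidates, at offsets 0, 5 and 6, and only the last of them is a real object boundary.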
A `FromJava` trait that would allow a `readObject<T: FromJava>()` method would make the process more straightforward.
Very much a work in progress at the moment. I am writing this for another application I am working on so I imagine there will be many changes in the functionality and API at least in the short term as the requirements become apparent. As things settle down I hope things will become more stable.
As this project is still very much in a pre-alpha state, I imagine things being quite unstable for a while. That said, if you notice anything obviously broken or have a feature that you think would be useful that I've missed entirely, do open issues. I'd avoid opening PRs until the change has been discussed in an issue, as the current repo state may lag behind development.