Spring Data MongoDB – remove the '_class' field or define it explicitly with @DocumentType

Spring Data's MongoDB support powers many applications that use a NoSQL approach to store data. It still maps raw, schemaless documents onto strongly typed Java objects, however.

A NoSQL document carries no information about its structure; it holds only simple data such as values, arrays and nested documents. In Java code you can build an aggregate (an object held inside another object) that corresponds to nested documents in the NoSQL structure. This saves the extra joins or additional queries that One-To-Many and One-To-One relations require in the relational data model.

Serializing an object is possible as long as its structure is known. When you save an object to the database, it is simply serialized into a JSON document.

Fetching JSON documents from the database and assigning them to a known Java structure is not a problem either. In object-oriented programming, however, classes are often extended by other classes with an extended structure. For example, take class A and class B (extending A without any extra fields). When you query for all objects of type A (objects of both type A and type B should be returned, because a B is also an A), there is no way to determine an object's concrete type during deserialization. This is a consequence of polymorphism in object-oriented programming.

To solve that issue, Spring Data's MongoTemplate adds an extra _class field in which the fully qualified class name is stored.
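The idea of a type hint can be sketched in plain Java. This is only an illustration under stated assumptions, not Spring Data's actual implementation: the classes A and B, the helper TypeHintSketch and its write/read methods are hypothetical names, and a document is modeled as a simple Map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain types: B extends A without adding any fields.
class A {
	String name;
}

class B extends A {
}

// Hypothetical helper (not part of Spring Data) that stores and resolves a type hint.
class TypeHintSketch {

	static final String TYPE_KEY = "_class";

	// "Serializes" an object into a document and records its concrete class name.
	static Map<String, Object> write(A source) {
		Map<String, Object> document = new HashMap<>();
		document.put("name", source.name);
		document.put(TYPE_KEY, source.getClass().getName());
		return document;
	}

	// "Deserializes" a document; falls back to the requested type A when no hint is stored.
	static A read(Map<String, Object> document) {
		try {
			Object hint = document.get(TYPE_KEY);
			Class<?> target = (hint == null) ? A.class : Class.forName((String) hint);
			A result = (A) target.getDeclaredConstructor().newInstance();
			result.name = (String) document.get("name");
			return result;
		} catch (ReflectiveOperationException e) {
			throw new IllegalStateException("Could not instantiate the hinted type.", e);
		}
	}

	public static void main(String[] args) {
		B original = new B();
		original.name = "example";

		Map<String, Object> document = write(original);
		System.out.println(read(document) instanceof B); // true: the hint restores the subtype

		document.remove(TYPE_KEY);
		System.out.println(read(document) instanceof B); // false: only the base type A is known
	}
}
```

Without the hint, the document can only be read back as the requested base type, which is exactly the trade-off discussed in the rest of this post.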

In many cases we do not use polymorphism in our data model, and the extra _class field is just wasted disk space.

Avoid storing _class field

To avoid storing the _class field, simply inject a DefaultMongoTypeMapper with the type field name set to null into MongoTemplate:

@Configuration
@EnableMongoRepositories(basePackages="example")
public class AuthDataSourceConfiguration {

	// ...
    
	@Bean
	public MongoClient mongoDbClient() throws Exception {
		return new MongoClient(new ServerAddress("127.0.0.1"));
	}

	@Bean
	public MongoDbFactory mongoDbFactory() throws Exception {
		return new SimpleMongoDbFactory(mongoDbClient(), "dbname");
	}

	@Bean
	public MongoTemplate mongoTemplate() throws Exception {
		// a null type key means no type hint field is written
		MongoTypeMapper typeMapper = new DefaultMongoTypeMapper(null);
		MappingMongoConverter converter = new MappingMongoConverter(mongoDbFactory(), new MongoMappingContext());
		converter.setTypeMapper(typeMapper);

		MongoTemplate mongoTemplate = new MongoTemplate(mongoDbFactory(), converter);
		return mongoTemplate;
	}
}

This solution will completely ignore the extra type value.

Caution! Objects of classes that extend other classes can no longer be deserialized into their concrete subtype!

Customize _class field value depending on Object type

Another approach is to explicitly define the value of the _class field for specified types. Some types should carry type information, some should not.

We will try to achieve the following:

  1. Still keep polymorphism in some cases.
  2. Explicitly define which classes should carry information about their Java type.
  3. The type value should be configurable, shorter than the fully qualified class name, and independent of the source code.
  4. The _class field name should also be configurable.

Ad 1, 2, 3. To specify how types are mapped to aliases and vice versa, you have to implement the TypeInformationMapper interface or use one of the existing implementations, such as ConfigurableTypeInformationMapper, which expects a Map<? extends Class<?>, String>. This is an easy and convenient way to map a class to some key (not necessarily a fully qualified class name). If no mapping exists for a type, its alias is null and nothing is inserted into the document.

Ad 4. The field name is provided by the MongoTypeMapper (in our case DefaultMongoTypeMapper):

	// ...
	
	@Bean
	public MongoTemplate mongoTemplate() throws Exception {
		TypeInformationMapper typeMapper1 = ...;
		MongoTypeMapper typeMapper = new DefaultMongoTypeMapper(DefaultMongoTypeMapper.DEFAULT_TYPE_KEY, Arrays.asList(typeMapper1));
		MappingMongoConverter converter = new MappingMongoConverter(mongoDbFactory(), new MongoMappingContext());
		converter.setTypeMapper(typeMapper);
        	
		MongoTemplate mongoTemplate = new MongoTemplate(mongoDbFactory(), converter);
		return mongoTemplate;
	}

Here we keep the default field name, DefaultMongoTypeMapper.DEFAULT_TYPE_KEY (that is, "_class").

We will implement a mechanism based on annotations: it scans packages for classes annotated with @DocumentType("aliasValue").

First, define our annotation:

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
public @interface DocumentType {
	
	public String value() default "";

}
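
As a quick illustration of how such an annotation is used and read back at runtime, here is a minimal sketch. It repeats the annotation so that it compiles on its own; the classes Person and AuditEntry and the helper DocumentTypeSketch.aliasOf are hypothetical names introduced only for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Same shape as the annotation above, repeated so the sketch is self-contained.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface DocumentType {
	String value() default "";
}

// Hypothetical domain class that should be stored with the short alias "person".
@DocumentType("person")
class Person {
}

// Hypothetical domain class without an alias: no type hint should be stored for it.
class AuditEntry {
}

class DocumentTypeSketch {

	// Returns the declared alias, or null when the class is not annotated.
	static String aliasOf(Class<?> type) {
		DocumentType annotation = type.getAnnotation(DocumentType.class);
		return (annotation == null) ? null : annotation.value();
	}

	public static void main(String[] args) {
		System.out.println(aliasOf(Person.class));     // person
		System.out.println(aliasOf(AuditEntry.class)); // null
	}
}
```

RetentionPolicy.RUNTIME is what makes the annotation visible to getAnnotation(); with the default retention the lookup would return null.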

Second, create a custom TypeInformationMapper that scans packages and looks up the annotation:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.springframework.beans.factory.config.BeanDefinition;
import org.springframework.context.annotation.ClassPathScanningCandidateComponentProvider;
import org.springframework.core.type.filter.AnnotationTypeFilter;
import org.springframework.data.convert.TypeInformationMapper;
import org.springframework.data.util.ClassTypeInformation;
import org.springframework.data.util.TypeInformation;

/**
 * @author Piotr `Athlan` Pelczar
 */
public class AnnotationTypeInformationMapper implements TypeInformationMapper {

	private final Map<ClassTypeInformation<?>, String> typeToAliasMap;
	private final Map<String, ClassTypeInformation<?>> aliasToTypeMap;
	
	private AnnotationTypeInformationMapper(List<String> basePackagesToScan) {
		typeToAliasMap = new HashMap<>();
		aliasToTypeMap = new HashMap<>();
		
		populateTypeMap(basePackagesToScan);
	}
	
	private void populateTypeMap(List<String> basePackagesToScan) {
		ClassPathScanningCandidateComponentProvider scanner = new ClassPathScanningCandidateComponentProvider(false);
		
		scanner.addIncludeFilter(new AnnotationTypeFilter(DocumentType.class));
		
		for (String basePackage : basePackagesToScan) {
			for (BeanDefinition bd : scanner.findCandidateComponents(basePackage)) {
				try {
					Class<?> clazz = Class.forName(bd.getBeanClassName());
					DocumentType documentTypeAnnotation = clazz.getAnnotation(DocumentType.class);
					
					ClassTypeInformation<?> type = ClassTypeInformation.from(clazz);
					String alias = documentTypeAnnotation.value();
					
					typeToAliasMap.put(type, alias);
					aliasToTypeMap.put(alias, type);
					
				} catch (ClassNotFoundException e) {
					throw new IllegalStateException(String.format("Class [%s] could not be loaded.", bd.getBeanClassName()), e);
				}
			}
		}
	}

	/* 
	 * (non-Javadoc)
	 * @see org.springframework.data.convert.TypeInformationMapper#createAliasFor(org.springframework.data.util.TypeInformation)
	 */
	public Object createAliasFor(TypeInformation<?> type) {
		ClassTypeInformation<?> typeClass = (ClassTypeInformation<?>) type;
		
		if(typeToAliasMap.containsKey(typeClass)) {
			return typeToAliasMap.get(typeClass);
		}
		
		return null;
	}

	/*
	 * (non-Javadoc)
	 * @see org.springframework.data.convert.TypeInformationMapper#resolveTypeFrom(java.lang.Object)
	 */
	public ClassTypeInformation<?> resolveTypeFrom(Object alias) {

		if(aliasToTypeMap.containsKey(alias)) {
			return aliasToTypeMap.get(alias);
		}
		
		return null;
	}

	public static class Builder {
		List<String> basePackagesToScan;
		
		public Builder() {
			basePackagesToScan = new ArrayList<>();
		}
		
		public Builder withBasePackage(String basePackage) {
			basePackagesToScan.add(basePackage);
			return this;
		}
		
		public Builder withBasePackages(String[] basePackages) {
			basePackagesToScan.addAll(Arrays.asList(basePackages));
			return this;
		}
		
		public Builder withBasePackages(Collection<? extends String> basePackages) {
			basePackagesToScan.addAll(basePackages);
			return this;
		}
		
		public AnnotationTypeInformationMapper build() {
			AnnotationTypeInformationMapper builded = new AnnotationTypeInformationMapper(basePackagesToScan);
			
			return builded;
		}
	}
}

The usage is simple:

	@Bean
	public MongoTemplate mongoTemplate() throws Exception {
		String[] basePackages = new String[] {"selly"};
		TypeInformationMapper typeMapper1 = new AnnotationTypeInformationMapper.Builder().withBasePackages(basePackages).build();
		
		MongoTypeMapper typeMapper = new DefaultMongoTypeMapper(DefaultMongoTypeMapper.DEFAULT_TYPE_KEY, Arrays.asList(typeMapper1));
		MappingMongoConverter converter = new MappingMongoConverter(mongoDbFactory(), new MongoMappingContext());
		converter.setTypeMapper(typeMapper);
        
		MongoTemplate mongoTemplate = new MongoTemplate(mongoDbFactory(), converter);
		return mongoTemplate;
	}

Complete Gist:
https://gist.github.com/athlan/6497c74cc515131e1336

Hope it helped 🙂

 

Double-checked locking with Singleton pattern in Java

I recently faced a problem synchronizing many threads that start at the same time (microseconds apart) and create a single instance of a database connection object using the Singleton pattern in Java. As a result I ended up with many connections instead of one, and the counter of sent queries was lower than expected in my simulations.

I then found, via Google, the IBM article "Double-checked locking and the Singleton pattern" by Peter Haggar, Senior Software Engineer.

Problem overview

Creating a singleton in Java is simple. There are two common ways to do it:

  1. Lazy loading: declare a private static field _instance, which is null by default (Java's default initialization of object references). The instance is created when the static method getInstance() is first called.
  2. Eager loading: create the instance in advance, when the class is initialized, by assigning the private static field _instance the result of calling the private constructor: new SingletonClass();

1st implementation with lazy initialization

package pl.athlan.examples.singleton;

public class Singleton {
	private static Singleton _instance; // null by default

	private Singleton() {
	}

	public static Singleton getInstance() {
		if(_instance == null) {
			_instance = new Singleton();
		}

		return _instance;
	}
}

2nd implementation with eager initialization

package pl.athlan.examples.singleton;

public class Singleton {
	private static Singleton _instance = new Singleton(); // object is created just after class is loaded into memory

	private Singleton() {
	}

	public static Singleton getInstance() {
		return _instance;
	}
}

Motivation

Imagine two separate threads that call the getInstance() method at the same time.

Thread #1                    | Thread #2                    | value of _instance
Singleton.getInstance()      |                              | null
                             | Singleton.getInstance()      | null
if(_instance == null)        |                              | null
                             | if(_instance == null)        | null
_instance = new Singleton()  |                              | [object #1]
                             | _instance = new Singleton()  | [object #2]

As a result, two objects have been created, because thread #2 checked _instance before thread #1 had assigned it.

If your object stores shared data such as (in my case) a database query counter, or if creating the object is so expensive that the system hangs while many threads do it, this situation must not occur.

Solving the problem

The solution is to synchronize the threads when they access the getInstance() method. You can simply write:

public static synchronized Singleton getInstance()

but this solution produces a huge overhead, because every thread calling the method is synchronized. A better solution is to synchronize only the fragment of code that checks for existence and actually creates the object, and to skip synchronization when the instance is simply returned because it already exists.

Final solution:

package pl.athlan.examples.singleton;

public class Singleton {
	private static volatile Singleton _instance;

	private Singleton() {
	}

	public static Singleton getInstance() {
		if(_instance == null) {

			// only one thread at a time may execute this block
			synchronized(Singleton.class) {

				// if a thread that held the lock earlier already created the instance, skip the creation
				if(_instance == null) {
					_instance = new Singleton();
				}
			}
		}

		return _instance;
	}
}

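The synchronized block provides the mutual exclusion. The volatile keyword on the _instance field guarantees that the assignment is visible to all threads and, under the Java memory model since Java 5, that no thread can observe a partially constructed object. Without volatile, double-checked locking is broken.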

If there is no instance of the object yet, the thread enters the synchronized block. This means that all threads are queued to access that block. Once inside, check one more time whether the object really does not exist, because the thread does not know what happened before it reached the front of the queue. If another thread created the object in the meantime, just skip the creation.
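
One way to convince yourself that only one instance is created is a small concurrency check. The sketch below is only an illustration: DclSingleton repeats the double-checked locking shape from above under a different name, and DclCheck.distinctInstances is a hypothetical helper that releases a number of threads together and counts how many distinct instances they saw.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.concurrent.CountDownLatch;

// Same double-checked locking shape as above, renamed to keep the sketch self-contained.
class DclSingleton {
	private static volatile DclSingleton instance;

	private DclSingleton() {
	}

	static DclSingleton getInstance() {
		if (instance == null) {
			synchronized (DclSingleton.class) {
				if (instance == null) {
					instance = new DclSingleton();
				}
			}
		}
		return instance;
	}
}

class DclCheck {

	// Releases threadCount threads with one start signal and counts the distinct instances they obtained.
	static int distinctInstances(int threadCount) {
		Set<DclSingleton> seen = Collections.synchronizedSet(
				Collections.newSetFromMap(new IdentityHashMap<DclSingleton, Boolean>()));
		CountDownLatch start = new CountDownLatch(1);
		Thread[] threads = new Thread[threadCount];

		for (int i = 0; i < threadCount; i++) {
			threads[i] = new Thread(() -> {
				try {
					start.await(); // wait for the common start signal
					seen.add(DclSingleton.getInstance());
				} catch (InterruptedException e) {
					Thread.currentThread().interrupt();
				}
			});
			threads[i].start();
		}

		start.countDown(); // release all waiting threads together
		try {
			for (Thread thread : threads) {
				thread.join();
			}
		} catch (InterruptedException e) {
			Thread.currentThread().interrupt();
			throw new IllegalStateException("Interrupted while waiting for worker threads.", e);
		}
		return seen.size();
	}

	public static void main(String[] args) {
		System.out.println(distinctInstances(32)); // 1
	}
}
```

A check like this can show that a race exists in the unsynchronized version, but a passing run is not a proof of correctness; the guarantee comes from the synchronized block and the volatile field.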

Hope it helped!

NOTE: Implementing a singleton as an enum is thread-safe and reflection-safe.
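
A minimal sketch of the enum variant, assuming a shared query counter like the one from my case; the names QueryCounter, increment and current are hypothetical and chosen only for this example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical enum-based singleton holding a shared counter of sent queries.
enum QueryCounter {
	INSTANCE; // the JVM creates this single constant once, during class initialization

	private final AtomicLong sentQueries = new AtomicLong();

	// Atomically adds one and returns the new value.
	long increment() {
		return sentQueries.incrementAndGet();
	}

	long current() {
		return sentQueries.get();
	}
}

class QueryCounterDemo {
	public static void main(String[] args) {
		QueryCounter.INSTANCE.increment();
		QueryCounter.INSTANCE.increment();
		System.out.println(QueryCounter.INSTANCE.current()); // 2
	}
}
```

The enum only guarantees that a single instance exists. Any mutable state inside it, such as the counter here, still needs its own thread safety, which is why the sketch uses AtomicLong.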