Unlike other indexing software such as Elasticsearch, Solr can be deployed in any Java Servlet container. The documentation provides an example with Apache Tomcat, which is our target environment at the MNHN. As with any standard Tomcat installation, this raises the question of how much memory to allocate to the JVM. After reading the Solr documentation on the topic, we chose to give most of the RAM to the JVM: 5 GB out of a total of 6, leaving a single gigabyte for the system.

We quickly identified a major problem: while indexing, the Solr engine returned timeout errors on select queries. A Stack Overflow exchange describes similar behavior. On the server side, we noticed heavy use of virtual memory.

As I answered in that exchange, we solved this problem a few months ago after reading a blog post by Uwe Schindler (a Solr committer). With Solr 4 and several Solr 3 versions, you have to leave a large share of your RAM free so that the system can make proper use of the mmap system call. This is due to the introduction in Solr of a new Lucene component: MMapDirectory. It maps the index files into virtual memory and relies on the operating system's file cache, which lives outside the JVM heap, so memory given to the JVM is memory taken away from that cache. The blog post gives plenty of information on system configuration. In our case, this solved the problem: we could finally index without any timeout issues.
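
To make the mmap mechanism concrete, here is a minimal sketch that uses Python's standard `mmap` module rather than Lucene itself. It is not Solr code: the temporary file, its contents, and the `read_via_mmap` helper are all made up for the illustration. The point it shows is that a mapped file is read through the operating system's page cache, and only the bytes actually requested are copied into the process.

```python
import mmap
import os
import tempfile


def read_via_mmap(path, offset, length):
    """Return `length` bytes starting at `offset`, read through a memory map."""
    with open(path, "rb") as f:
        # A length of 0 maps the whole file; ACCESS_READ makes it read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing copies only the requested bytes into the process.
            # The rest of the file is paged in by the OS on demand.
            return mapped[offset:offset + length]


# Hypothetical stand-in for an index file: 6-byte header, 4096 bytes of padding, 7-byte payload.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"header" + b"\x00" * 4096 + b"payload")
    index_path = tmp.name

chunk = read_via_mmap(index_path, 6 + 4096, 7)
print(chunk)  # b'payload'

os.remove(index_path)
```

With an index of several gigabytes, the same principle applies: the more RAM is left outside the JVM heap, the more of the mapped files the system can keep cached.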