dump switched to v4 with scale modif dumped
jean-pierreBoth committed Dec 19, 2024
1 parent a96fd6f commit 1716275
Showing 5 changed files with 21 additions and 14 deletions.
2 changes: 1 addition & 1 deletion Changes.md
@@ -2,7 +2,7 @@

Possibility to reduce the number of levels used in the Hnsw structure with the function hnsw::modify_level_scale.
This often significantly increases recall while incurring a moderate cpu cost. It is also possible
to have the same recall with smaller max_nb_conn parameters, thus reducing memory usage.
to have the same recall with smaller *max_nb_conn* parameters, thus reducing memory usage.
See README.md at [bigann](https://github.com/jean-pierreBoth/bigann).
Modification inspired by the article by [Munyampirwa](https://arxiv.org/abs/2412.01940)

8 changes: 5 additions & 3 deletions README.md
@@ -79,8 +79,10 @@ Upon insertion, the level ***l*** of a new point is sampled with an exponential law
so that level 0 is the most densely populated layer, upper layers being exponentially less populated as the level increases.
The nearest neighbour of the point is searched in lookup tables from the top level down to the level just above its own layer (***l***), so we should arrive near the new point at its level at a relatively low cost. Then the ***max_nb_connection*** nearest neighbours are searched in the neighbours-of-neighbours tables (with a reverse updating of tables) recursively from its layer ***l*** down to the most populated level 0.

The scale parameter of the exponential law depends on the maximum number of connections possible for a point (parameter ***max_nb_connection***) to others.
Explicitly the scale parameter is chosen as: `scale=1/ln(max_nb_connection)`.
The parameter of the exponential law used to sample point levels is set to `ln(max_nb_connection)/scale`.
By default *scale* is set to 1. It is possible to reduce the *scale* parameter and thus reduce the number of levels used (see Hnsw::modify_level_scale).
This often provides better recall without increasing *max_nb_connection*, which would increase memory usage (see the examples, and the sketch below).
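
As an illustration, here is a minimal sketch of how a level can be drawn under this law. The helper `sample_level`, its arguments, and the example draws are assumptions for illustration only, not the crate's actual sampling code:

```rust
/// P(level >= l) = exp(-l * ln(max_nb_conn) / scale), so for U uniform in (0,1)
/// the sampled level is floor(-ln(U) * scale / ln(max_nb_conn)).
/// With scale = 1 this is the classical HNSW rule; scale < 1 yields fewer levels.
fn sample_level(u: f64, max_nb_conn: usize, scale: f64) -> usize {
    let ml = scale / (max_nb_conn as f64).ln(); // mean of the exponential law
    (-u.ln() * ml).floor() as usize
}

fn main() {
    // the same uniform draw u = 0.05 with max_nb_conn = 16:
    println!("{}", sample_level(0.05, 16, 1.0)); // scale = 1.0 -> level 1
    println!("{}", sample_level(0.05, 16, 0.5)); // scale = 0.5 -> level 0
}
```

With a smaller scale the same random draw lands on a lower level, so points concentrate on the densely connected bottom layers.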


The main parameters occurring in constructing the graph or in searching are:

@@ -119,7 +121,7 @@ With an i9-13900HX 24-core laptop we get the following results:
3. sift1m benchmark: (1 million points in 128 dimensions) search requests for the first 10 neighbours run at 15000 req/s with a recall rate of 0.9907, or at 8300 req/s with a recall rate of 0.9959, depending on the parameters.

Moreover a tiny crate [bigann](https://github.com/jean-pierreBoth/bigann)
gives results on the first 10 Million points of the [BIGANN](https://big-ann-benchmarks.com/neurips21.html) or [IRISA](http://corpus-texmex.irisa.fr/) benchmark and can be used to play with parameters on this data. Results give a recall between 0.92 and 0.99 depending on the number of requests and parameters.
gives results on the first 10 Million points of the [BIGANN](https://big-ann-benchmarks.com/neurips21.html) benchmark. The benchmark is also described at [IRISA](http://corpus-texmex.irisa.fr/). This crate can be used to play with parameters on this data. Results give a recall between 0.92 and 0.99 depending on the number of requests and parameters.

Some lines extracted from this MNIST benchmark show how it works for f32 and the L2 norm

3 changes: 2 additions & 1 deletion examples/ann-mnist-784-euclidean.rs
@@ -59,7 +59,8 @@ pub fn main() {
let mut hnsw = Hnsw::<f32, DistL2>::new(max_nb_connection, nb_elem, nb_layer, ef_c, DistL2 {});
hnsw.set_extend_candidates(false);
//
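// reduce the level sampling scale: fewer levels are generated, which often improves recall (see README)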
// hnsw.modify_level_scale(0.25);
hnsw.modify_level_scale(0.5);
//
// parallel insertion
let mut start = ProcessTime::now();
let mut now = SystemTime::now();
14 changes: 8 additions & 6 deletions src/hnsw.rs
@@ -861,14 +861,16 @@ impl<'b, T: Clone + Send + Sync, D: Distance<T> + Send + Sync> Hnsw<'b, T, D> {
self.datamap_opt
}

/// By default the levels are sampled using an exponential law of parameter 1./ln(max_nb_conn)
/// so the number of levels used is around ln(max_nb_conn) + 1.
/// Reducing the scale reduces the number of levels generated and can provide better precision (reduces memory but with some more cpu used)
/// By default the levels are sampled using an exponential law of parameter ln(max_nb_conn),
/// so the probability of having more than l levels decreases as exp(-l * ln(max_nb_conn)).
/// Reducing the scale changes the parameter of the exponential to ln(max_nb_conn)/scale.
/// This reduces the number of levels generated and can provide better precision and reduced memory, at some extra cpu cost.
/// The factor must be between 0.2 and 1.
// This is just to experiment with parameter variations of the algorithm, not for general use.
#[allow(unused)]
pub fn modify_level_scale(&mut self, scale_modification: f64) {
//
if self.get_nb_point() > 0 {
println!("using modify_level_scale is possible at creation of a Hnsw structure to ensure coherence between runs")
}
//
let min_factor = 0.2;
println!("\n Current scale value : {:.2e}, Scale modification factor asked : {:.2e},(modification factor must be between {:.2e} and 1.)",
8 changes: 5 additions & 3 deletions src/hnswio.rs
@@ -874,7 +874,7 @@ impl Description {
///
fn dump<W: Write>(&self, argmode: DumpMode, out: &mut BufWriter<W>) -> Result<i32> {
info!("in dump of description");
out.write_all(&MAGICDESCR_3.to_ne_bytes())?;
out.write_all(&MAGICDESCR_4.to_ne_bytes())?;
let mode: u8 = match argmode {
DumpMode::Full => 1,
_ => 0,
@@ -883,7 +883,9 @@
out.write_all(&mode.to_ne_bytes())?;
// dump of max_nb_connection as u8!!
out.write_all(&self.max_nb_connection.to_ne_bytes())?;
// TODO: with MAGICDESCR_4 we must dump self.level_scale
// with MAGICDESCR_4 we also dump self.level_scale
out.write_all(&self.level_scale.to_ne_bytes())?;
//
out.write_all(&self.nb_layer.to_ne_bytes())?;
if self.nb_layer != NB_LAYER_MAX {
println!("dump of Description, nb_layer != NB_MAX_LAYER");
@@ -1150,7 +1152,7 @@
let v: Vec<T> = if std::any::TypeId::of::<T>() != std::any::TypeId::of::<NoData>() {
match descr.format_version {
2 => bincode::deserialize(&v_serialized).unwrap(),
3 => {
3 | 4 => {
let slice_t = unsafe {
std::slice::from_raw_parts(v_serialized.as_ptr() as *const T, descr.dimension)
};
