More and more websites include a search engine that lets users enter one or more terms and look for them in the contents of a database. The success of an online store, whatever it sells (food, health products, books, travel tickets, etc.), depends on its ability to return relevant results from that database, which can be the difference between a sale and a lost customer.

Many websites (some of them very well known and heavily used) try to handle this by "pluralizing" and "singularizing" the user's request. For example, if the user searches for "fresas" (strawberries), the system detects the ending "as", treats it as the plural marker and strips it, which leaves "fres".

Next, the system searches for all products that contain the string "fres", so the results are mixed and mostly irrelevant, and there are far too many of them for a supermarket customer to sift through: "pescadilla fresca" (fresh whiting), "limpiador fresco" (a cleaning product), "refresco naranja" (orange soda), and many more.
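The problem can be reproduced with a minimal sketch. The function names and the sample catalogue below are hypothetical and only mimic the suffix-stripping behaviour described above; they are not taken from any real store.

```php
<?php
// Naive "singularization": strip a trailing "as" or "s".
// This mimics the behaviour described above; it is not linguistically sound.
function naiveSingularize(string $word): string
{
    return preg_replace('/(as|s)$/', '', $word, 1);
}

// Substring search over product names (hypothetical sample catalogue).
function naiveSearch(string $query, array $products): array
{
    $stem = naiveSingularize(strtolower($query));
    return array_values(array_filter(
        $products,
        function (string $name) use ($stem): bool {
            return strpos(strtolower($name), $stem) !== false;
        }
    ));
}

$products = [
    'fresas',
    'pescadilla fresca',
    'limpiador fresco',
    'refresco naranja',
    'manzanas',
];

// Every product containing "fres" is returned, not only the strawberries.
print_r(naiveSearch('fresas', $products));
```

Only "manzanas" is filtered out here; the three unrelated products match because the truncated string "fres" happens to appear inside them.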

Our goal is to implement a system that retrieves more precise results with the help of Apicultur's APIs, in particular the lemmatization API. You can find out more about lemmatization in this post: Why is lemmatization useful?

And now that we have looked at the current situation, we are going to have a look at a practical way of dealing with this by using our website.

To do so, you must complete the following steps:

  1. Create an Apicultur account and subscribe to the lemmatization APIs.
  2. Select the database fields on which to carry out the text search.
  3. Process each of the words from these fields and store the result of the process in a new field associated with it.
  4. Enable the user to choose between a “normal” and a “lemmatized” search (totally optional).
  5. Process the user's request ("lemmatize it") and find the records whose new field contains the processed request.

The following sections explain each step in more detail:

1. Creation of an Apicultur account

To create an API user account, you have to complete a few steps: access the API store, register as a user, subscribe to the lemmatization API, and obtain the API key for your application. More detailed information about this process is available at http://www.apicultur.com/instrucciones/.

2. Select the fields

In this step we select the field or fields of the database that contain the name and/or the description of the product. For each of them we create a new field to store its lemmatized value: name_product_lemma and description_product_lemma.
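As an illustration, the new fields could be added with statements like the ones generated below. This is only a sketch: the table name `products`, the original column names `name_product` and `description_product`, and the `TEXT` column type are assumptions, so adapt them to your own schema.

```php
<?php
// Build one ALTER TABLE statement per searchable field, adding a
// "<field>_lemma" column next to it.
// Assumptions: table name and TEXT column type are hypothetical.
function buildLemmaColumnSql(string $table, array $fields): array
{
    $statements = [];
    foreach ($fields as $field) {
        $statements[] = sprintf(
            'ALTER TABLE %s ADD COLUMN %s_lemma TEXT',
            $table,
            $field
        );
    }
    return $statements;
}

$statements = buildLemmaColumnSql('products', ['name_product', 'description_product']);
foreach ($statements as $sql) {
    echo $sql, ";\n";
}
```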

3. Process the words

To illustrate this process we are going to use PHP, although the same approach carries over to any other programming language.

We must create a variable that stores our AccessKey and the URL of the API that we are going to use, in this case the lemmatization one. We then add to the URL the word whose lemma we want to know:

[php autolinks="false" gutter="true" collapse="false" firstline="1" highlight="true" light="true" smarttabs="true" tabsize="4" toolbar="true"]
$access_key = "7Qdfdf36HKIEsg7XsUKNsaqFx2sB1"; # API key of our application in APICultur. More information: http://www.apicultur.com/instrucciones/
# $palabra holds the word to lemmatize; it is URL-encoded so accented characters are sent safely
$url = "http://store.apicultur.com/api/lematiza-clasico/1.0.0/" . rawurlencode($palabra);
[/php]

We then prepare a connection to make the request:

[php autolinks="false" gutter="true" collapse="false" firstline="4" highlight="true" light="true" smarttabs="true" tabsize="4" toolbar="true"]
# Initialize cURL
$ch = curl_init();

# Send our API key and state that we expect JSON in return
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Accept: application/json', 'Authorization: Bearer ' . $access_key));

# Set the API URL and ask cURL to return the response instead of printing it
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

# Store the value returned by the API
$respuesta = curl_exec($ch);

$http_status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

# Close the cURL handle and free its resources
curl_close($ch);

# Check the status code returned by the API; decode the JSON only on success
switch ($http_status) {
    case 200:
        $lemas = json_decode($respuesta);
        break;
    default:
        echo "\nError lemmatizing the word: " . $palabra;
        echo "\nError: " . $http_status;
        break;
}
[/php]

Now we can access the different lemmas of the word that we submitted:

[php autolinks="false" gutter="true" collapse="false" firstline="35" highlight="true" light="true" smarttabs="true" tabsize="4" toolbar="true"]
# $lemas is only set when the request succeeded
if (isset($lemas)) {
    $laPalabra = $lemas->palabra;
    $losLemas = $lemas->lemas;

    echo "\n<b>Word: " . $laPalabra . "</b>";

    # A word can have several lemmas, each with its grammatical category
    foreach ($losLemas as $lema) {
        $elLema = $lema->lema;
        $laCategoria = $lema->categoria;
        echo "\nLemma: " . $elLema;
        echo " - category: " . $laCategoria;
    }
}
[/php]
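For storage we only need the lemma strings, not the printed output. A minimal sketch of a helper that collects them follows. The function name `extractLemmas` is hypothetical, and the sample JSON (including the category value) is invented for illustration, shaped after the `palabra`, `lemas`, `lema` and `categoria` fields read above rather than copied from a real API response.

```php
<?php
// Collect the distinct lemma strings from a decoded API response.
// Returns an empty array when the response is missing or malformed.
function extractLemmas(?object $decoded): array
{
    if ($decoded === null || !isset($decoded->lemas) || !is_array($decoded->lemas)) {
        return [];
    }
    $lemmas = [];
    foreach ($decoded->lemas as $entry) {
        if (isset($entry->lema)) {
            $lemmas[] = $entry->lema;
        }
    }
    return array_values(array_unique($lemmas));
}

// Hypothetical sample response, for illustration only.
$sample = json_decode(
    '{"palabra":"fresas","lemas":[{"lema":"fresa","categoria":"sustantivo"}]}'
);
print_r(extractLemmas($sample));
```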

We repeat this for every word contained in the record that we are lemmatizing. Once all the words of a record have been lemmatized, we update the record, storing the lemmas of its title and description in the new fields created for that purpose.
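The loop over the words of a field might look like the sketch below. It is an assumption-laden illustration: `lemmatizeField` is a hypothetical helper, the lemmas are stored as one space-separated string (a design choice of this sketch, not something the API requires), and `$lemmatize` stands for any callable that returns the lemmas of a word, for example a wrapper around the cURL request shown earlier. A stub is used here so the example runs without network access.

```php
<?php
// Lemmatize every word of a text field and return the distinct lemmas
// as one space-separated string, ready to store in the "_lemma" column.
function lemmatizeField(string $text, callable $lemmatize): string
{
    $lower = function_exists('mb_strtolower') ? mb_strtolower($text) : strtolower($text);
    // Split on anything that is not a letter or a digit.
    $words = preg_split('/[^\p{L}\p{N}]+/u', $lower, -1, PREG_SPLIT_NO_EMPTY);

    $lemmas = [];
    foreach ($words as $word) {
        $found = $lemmatize($word);
        // Keep the original word when no lemma is returned for it.
        foreach ($found ?: [$word] as $lemma) {
            $lemmas[$lemma] = true; // array keys remove duplicates
        }
    }
    return implode(' ', array_keys($lemmas));
}

// Stub lemmatizer with a tiny hard-coded dictionary (hypothetical data).
$stub = function (string $word): array {
    $known = ['fresas' => ['fresa'], 'frescas' => ['fresco']];
    return $known[$word] ?? [];
};

echo lemmatizeField('Fresas frescas de Huelva', $stub), "\n";
```

Passing the lemmatizer as a callable keeps the text processing separate from the HTTP request, which makes it easy to test or to add caching later.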

4. Enable the selection of the type of search to carry out (optional)

This step is entirely optional. It lets the user choose between a normal search, which looks in the unprocessed titles and descriptions, and the new lemmatized search.
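One simple way to support both modes is to switch the columns that the query targets. In this sketch the mode value `lemmatized` and the original column names `name_product` and `description_product` are assumptions.

```php
<?php
// Return the columns to search for a given search mode.
// Any value other than "lemmatized" falls back to the normal search.
function searchColumns(string $mode): array
{
    return $mode === 'lemmatized'
        ? ['name_product_lemma', 'description_product_lemma']
        : ['name_product', 'description_product'];
}

print_r(searchColumns('lemmatized'));
```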

5. Process the user request

Finally, we modify the search process of our website. Instead of taking the user's search term as it is and looking for it in our title or description field, we "lemmatize it" and then search for each of the retrieved lemmas in the new columns name_product_lemma and description_product_lemma.

To do this, we can reuse the code shown before, with the variable $palabra holding the term that the user is searching for.
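Once the lemmas of the search term are known, the query itself could be built as in the following sketch. It assumes a `products` table, lemma columns that hold space-separated lemmas, and MySQL's `CONCAT` function; the helper name `buildLemmaSearch` is hypothetical. Padding both the column and the pattern with spaces makes the match apply to whole lemmas only, so "fresa" does not match "fresco".

```php
<?php
// Build a parameterized query that matches any of the lemmas, as a whole
// word, in either lemma column. Returns [sql, params] for a prepared statement.
function buildLemmaSearch(array $lemmas, string $table = 'products'): array
{
    $clauses = [];
    $params = [];
    foreach ($lemmas as $lemma) {
        // Escape LIKE wildcards that might appear in the lemma.
        $safe = str_replace(['\\', '%', '_'], ['\\\\', '\\%', '\\_'], $lemma);
        $pattern = '% ' . $safe . ' %';

        $clauses[] = "(CONCAT(' ', name_product_lemma, ' ') LIKE ?"
                   . " OR CONCAT(' ', description_product_lemma, ' ') LIKE ?)";
        $params[] = $pattern;
        $params[] = $pattern;
    }
    // With no lemmas, return a query that matches nothing.
    $where = $clauses ? implode(' OR ', $clauses) : '1 = 0';
    return ["SELECT * FROM {$table} WHERE {$where}", $params];
}

[$sql, $params] = buildLemmaSearch(['fresa']);
echo $sql, "\n";
print_r($params);
```

The returned SQL and parameters can be passed to a prepared statement, for example with PDO's `prepare()` and `execute($params)`.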

Conclusions

As you can see, lemmatization adds a small delay to your requests, but in return it gives search results that are both more complete and more precise.

In the next instalment, we will show you a new API that enables this search operation without the need to "lemmatize" your whole database beforehand.

The example PHP code used in this post is available at the following link: php code.
