Elasticsearch Python API overview¶

In [1]:

import warnings
from elasticsearch import Elasticsearch, RequestsHttpConnection
warnings.filterwarnings('ignore')

Before you start¶

Running Elasticsearch with Docker¶

We will run an Elasticsearch cluster in a container. If the Elasticsearch image is not already available locally, pull it from the registry with the following command:

docker pull docker.elastic.co/elasticsearch/elasticsearch:7.11.1

Then run the container and publish port 9200:

docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.11.1

Running Elasticsearch with docker-compose¶

You can also run several nodes in the same cluster with docker-compose, for example:

version: '2.2'
services:
  es01:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.11.1
    container_name: es01
    environment:
      - node.name=es01
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es02,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data01:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - elastic
  es02:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.11.1
    container_name: es02
    environment:
      - node.name=es02
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data02:/usr/share/elasticsearch/data
    networks:
      - elastic
  es03:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.11.1
    container_name: es03
    environment:
      - node.name=es03
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es02
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data03:/usr/share/elasticsearch/data
    networks:
      - elastic

volumes:
  data01:
    driver: local
  data02:
    driver: local
  data03:
    driver: local

networks:
  elastic:
    driver: bridge

More information is available in the official Elasticsearch Docker documentation.

🚧 Check your Docker configuration 🚧¶

Elasticsearch demands a lot of resources from Docker (and therefore from your machine): Docker must be allowed to use at least 4 GB of memory. You can also change the JVM configuration of the containers directly through the ES_JAVA_OPTS=-Xms512m -Xmx512m parameter and lower it to 256m or even 128m.

📟 Exercise [optional]¶

Write a docker-compose.yml file with an Elasticsearch service on port 9200 (a single node), a Kibana service on port 5601, and a network named elnet.

Pinging the container¶

In [2]:

import requests
res = requests.get('http://localhost:9200?pretty')
print(res.content)

b'{\n  "name" : "3935d86abc4c",\n  "cluster_name" : "docker-cluster",\n  "cluster_uuid" : "pFca2ZHUT8KQpVXk2dIAzg",\n  "version" : {\n    "number" : "7.11.1",\n    "build_flavor" : "default",\n    "build_type" : "docker",\n    "build_hash" : "ff17057114c2199c9c1bbecc727003a907c0db7a",\n    "build_date" : "2021-02-15T13:44:09.394032Z",\n    "build_snapshot" : false,\n    "lucene_version" : "8.7.0",\n    "minimum_wire_compatibility_version" : "6.8.0",\n    "minimum_index_compatibility_version" : "6.0.0-beta1"\n  },\n  "tagline" : "You Know, for Search"\n}\n'
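The response body is raw JSON bytes, which is hard to read when printed as is. A minimal sketch of decoding it, assuming an abridged copy of the body shown above (the `raw` and `summary` names are only for illustration):

```python
import json

# Abridged copy of the response body printed above (bytes, like res.content).
raw = (
    b'{"name": "3935d86abc4c", "cluster_name": "docker-cluster", '
    b'"version": {"number": "7.11.1", "lucene_version": "8.7.0"}, '
    b'"tagline": "You Know, for Search"}'
)

# json.loads accepts bytes; with requests, res.json() performs the same decoding.
info = json.loads(raw)

# Keep only the fields that tell us which cluster and version answered.
summary = {
    "cluster": info["cluster_name"],
    "version": info["version"]["number"],
}
print(summary)
```

Checking the version number this way is a quick guard against talking to a different Elasticsearch release than the 7.11.1 image pulled earlier.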

In [3]:

es = Elasticsearch('http://localhost:9200')

Create, delete and verify index¶

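# create the index (ignore=400 suppresses the error raised if it already exists)
es.indices.create(index="first_index", ignore=400)

# verify that the index exists
print(es.indices.exists(index="first_index"))

# delete the index (ignoring 400/404 so that a missing index does not raise)
print(es.indices.delete(index="first_index", ignore=[400, 404]))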

Insert documents¶

#documents to insert in the elasticsearch index "cities"
doc1 = {"city":"New Delhi", "country":"India"}
doc2 = {"city":"London", "country":"England"}
doc3 = {"city":"Los Angeles", "country":"USA"}

#Inserting doc1 in id=1
es.index(index="cities", doc_type="places", id=1, body=doc1)

#Inserting doc2 in id=2
es.index(index="cities", doc_type="places", id=2, body=doc2)

#Inserting doc3 in id=3
es.index(index="cities", doc_type="places", id=3, body=doc3)

📟 Exercise [optional]¶

Find the function that checks that your index has been created.

In [4]:

Out[4]:

True

Retrieve data with id : `get`¶

In [5]:

res = es.get(index="cities", doc_type="places", id=2)
res

Out[5]:

{'_id': '2',
 '_index': 'cities',
 '_primary_term': 1,
 '_seq_no': 1,
 '_source': {'city': 'London', 'country': 'England'},
 '_type': 'places',
 '_version': 1,
 'found': True}

📟 Exercise [optional]¶

Starting from the variable res, display only the information below.

In [6]:

Out[6]:

{'city': 'London', 'country': 'England'}

Mapping¶

In [7]:

es.indices.get_mapping(index='cities')

Out[7]:

{'cities': {'mappings': {'properties': {'city': {'fields': {'keyword': {'ignore_above': 256,
       'type': 'keyword'}},
     'type': 'text'},
    'country': {'fields': {'keyword': {'ignore_above': 256,
       'type': 'keyword'}},
     'type': 'text'}}}}}

More about mappings: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html
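The nested mapping is easier to read once flattened to one line per field. A small sketch, assuming the mapping dictionary returned above; `field_types` is a hypothetical helper written for this notebook, not part of the client:

```python
def field_types(properties, prefix=""):
    """Flatten a mapping 'properties' dict into {field_path: type}."""
    types = {}
    for name, spec in properties.items():
        path = prefix + name
        if "type" in spec:
            types[path] = spec["type"]
        # Multi-fields ("fields") and object sub-fields ("properties") nest further.
        for key in ("fields", "properties"):
            if key in spec:
                types.update(field_types(spec[key], path + "."))
    return types


# Copy of the mapping shown in the output above.
mapping = {
    "cities": {"mappings": {"properties": {
        "city": {"type": "text",
                 "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
        "country": {"type": "text",
                    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
    }}}
}

flat = field_types(mapping["cities"]["mappings"]["properties"])
print(flat)
```

Each string field was mapped dynamically as an analyzed `text` field plus a `keyword` sub-field holding the exact value, which matters for the prefix and regexp queries later on.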

The `_search` endpoint and queries¶

For the examples that follow, make sure you have imported the data through the _bulk API.
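The _bulk endpoint expects newline-delimited JSON: one action line followed by one document line, with a final newline. A minimal sketch of building such a payload with the standard library, using the small cities documents as stand-in data (the `to_bulk_payload` helper is hypothetical, and the data files used for the movies and recipe indices are not shown in this notebook):

```python
import json


def to_bulk_payload(index, docs):
    """Build the newline-delimited JSON body expected by the _bulk endpoint."""
    lines = []
    for doc_id, doc in docs.items():
        # Action line: index this document under the given index and id.
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc_id)}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


cities = {
    1: {"city": "New Delhi", "country": "India"},
    2: {"city": "London", "country": "England"},
}
payload = to_bulk_payload("cities", cities)
print(payload)

# With a running cluster, the payload could then be sent with, for example:
# es.bulk(body=payload)
```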

In [8]:

res = es.search(index="cities", body={"query":{"match_all":{}}})
res

Out[8]:

{'_shards': {'failed': 0, 'skipped': 0, 'successful': 1, 'total': 1},
 'hits': {'hits': [{'_id': '1',
    '_index': 'cities',
    '_score': 1.0,
    '_source': {'city': 'New Delhi', 'country': 'India'},
    '_type': 'places'},
   {'_id': '2',
    '_index': 'cities',
    '_score': 1.0,
    '_source': {'city': 'London', 'country': 'England'},
    '_type': 'places'},
   {'_id': '3',
    '_index': 'cities',
    '_score': 1.0,
    '_source': {'city': 'Los Angeles', 'country': 'USA'},
    '_type': 'places'}],
  'max_score': 1.0,
  'total': {'relation': 'eq', 'value': 3}},
 'timed_out': False,
 'took': 5}

📟 Exercise [optional]¶

Starting from the variable res, display only the information below.

In [9]:

Out[9]:

[{'_id': '1',
  '_index': 'cities',
  '_score': 1.0,
  '_source': {'city': 'New Delhi', 'country': 'India'},
  '_type': 'places'},
 {'_id': '2',
  '_index': 'cities',
  '_score': 1.0,
  '_source': {'city': 'London', 'country': 'England'},
  '_type': 'places'},
 {'_id': '3',
  '_index': 'cities',
  '_score': 1.0,
  '_source': {'city': 'Los Angeles', 'country': 'USA'},
  '_type': 'places'}]

Narrowing the returned fields with `_source`¶

In [10]:

es.search(index="movies", body={
  "_source": {
    "includes": [
      "*.title",
      "*.directors"
    ],
    "excludes": [
      "*.actors*",
      "*.genres"
    ]
  },
  "query": {
    "match": {
      "fields.directors": "George"
    }
  }
})

Out[10]:

{'_shards': {'failed': 0, 'skipped': 0, 'successful': 5, 'total': 5},
 'hits': {'hits': [{'_id': '475',
    '_index': 'movies',
    '_score': 5.6268926,
    '_source': {'fields': {'directors': ['George Clooney'],
      'title': 'The Monuments Men'}},
    '_type': 'movie'},
   {'_id': '1183',
    '_index': 'movies',
    '_score': 5.6268926,
    '_source': {'fields': {'directors': ['George Nolfi'],
      'title': 'The Adjustment Bureau'}},
    '_type': 'movie'},
   {'_id': '4150',
    '_index': 'movies',
    '_score': 5.6268926,
    '_source': {'fields': {'directors': ['Terry George'],
      'title': 'Reservation Road'}},
    '_type': 'movie'},
   {'_id': '3378',
    '_index': 'movies',
    '_score': 4.881689,
    '_source': {'fields': {'directors': ['George Miller', 'George Ogilvie'],
      'title': 'Mad Max Beyond Thunderdome'}},
    '_type': 'movie'},
   {'_id': '226',
    '_index': 'movies',
    '_score': 4.719993,
    '_source': {'fields': {'directors': ['George Lucas'],
      'title': 'Star Wars'}},
    '_type': 'movie'},
   {'_id': '690',
    '_index': 'movies',
    '_score': 4.719993,
    '_source': {'fields': {'directors': ['George Clooney'],
      'title': 'The Ides of March'}},
    '_type': 'movie'},
   {'_id': '1165',
    '_index': 'movies',
    '_score': 4.719993,
    '_source': {'fields': {'directors': ['George Lucas'],
      'title': 'American Graffiti'}},
    '_type': 'movie'},
   {'_id': '3022',
    '_index': 'movies',
    '_score': 4.719993,
    '_source': {'fields': {'directors': ['George Lucas'],
      'title': 'THX 1138'}},
    '_type': 'movie'},
   {'_id': '3715',
    '_index': 'movies',
    '_score': 4.719993,
    '_source': {'fields': {'directors': ['George Gallo'],
      'title': 'Middle Men'}},
    '_type': 'movie'},
   {'_id': '4639',
    '_index': 'movies',
    '_score': 4.719993,
    '_source': {'fields': {'directors': ['George Miller'], 'title': 'Andre'}},
    '_type': 'movie'}],
  'max_score': 5.6268926,
  'total': {'relation': 'eq', 'value': 56}},
 'timed_out': False,
 'took': 87}
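Note that `_source` filtering only changes which fields come back; the `match` clause alone decides which documents match. Since each hit now carries just a title and its directors, the response can be reduced to a compact list. A sketch assuming a response shaped like the output above (abridged to two hits; `titles_and_directors` is a hypothetical helper):

```python
def titles_and_directors(response):
    """Return (title, directors) pairs from a search response."""
    return [
        (hit["_source"]["fields"]["title"], hit["_source"]["fields"]["directors"])
        for hit in response["hits"]["hits"]
    ]


# Two hits copied from the output above, with the same nesting.
sample = {
    "hits": {
        "hits": [
            {"_id": "475", "_source": {"fields": {
                "directors": ["George Clooney"], "title": "The Monuments Men"}}},
            {"_id": "3378", "_source": {"fields": {
                "directors": ["George Miller", "George Ogilvie"],
                "title": "Mad Max Beyond Thunderdome"}}},
        ]
    }
}

pairs = titles_and_directors(sample)
for title, directors in pairs:
    print(title, "-", ", ".join(directors))
```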

Boolean logic¶

In [11]:

es.search(index="movies", body=
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "fields.directors": "George"
          }
        },
        {
          "match": {
            "fields.title": "Star Wars"
          }
        }
      ]
    }
  }
})

Out[11]:

{'_shards': {'failed': 0, 'skipped': 0, 'successful': 5, 'total': 5},
 'hits': {'hits': [{'_id': '226',
    '_index': 'movies',
    '_score': 16.046509,
    '_source': {'fields': {'actors': ['Mark Hamill',
       'Harrison Ford',
       'Carrie Fisher'],
      'directors': ['George Lucas'],
      'genres': ['Action', 'Adventure', 'Fantasy', 'Sci-Fi'],
      'image_url': 'http://ia.media-imdb.com/images/M/MV5BMTU4NTczODkwM15BMl5BanBnXkFtZTcwMzEyMTIyMw@@._V1_SX400_.jpg',
      'plot': "Luke Skywalker joins forces with a Jedi Knight, a cocky pilot, a wookiee and two droids to save the universe from the Empire's world-destroying battle-station, while also attempting to rescue Princess Leia from the evil Darth Vader.",
      'rank': 226,
      'rating': 8.7,
      'release_date': '1977-05-25T00:00:00Z',
      'running_time_secs': 7260,
      'title': 'Star Wars',
      'year': 1977},
     'id': 'tt0076759',
     'type': 'add'},
    '_type': 'movie'},
   {'_id': '469',
    '_index': 'movies',
    '_score': 10.593456,
    '_source': {'fields': {'actors': ['Ewan McGregor',
       'Liam Neeson',
       'Natalie Portman'],
      'directors': ['George Lucas'],
      'genres': ['Action', 'Adventure', 'Fantasy', 'Sci-Fi'],
      'image_url': 'http://ia.media-imdb.com/images/M/MV5BMTQ4NjEwNDA2Nl5BMl5BanBnXkFtZTcwNDUyNDQzNw@@._V1_SX400_.jpg',
      'plot': 'Two Jedi Knights escape a hostile blockade to find allies and come across a young boy who may bring balance to the Force, but the long dormant Sith resurface to reclaim their old glory.',
      'rank': 469,
      'rating': 6.5,
      'release_date': '1999-05-19T00:00:00Z',
      'running_time_secs': 8160,
      'title': 'Star Wars: Episode I - The Phantom Menace',
      'year': 1999},
     'id': 'tt0120915',
     'type': 'add'},
    '_type': 'movie'},
   {'_id': '371',
    '_index': 'movies',
    '_score': 10.064606,
    '_source': {'fields': {'actors': ['Hayden Christensen',
       'Natalie Portman',
       'Ewan McGregor'],
      'directors': ['George Lucas'],
      'genres': ['Action', 'Adventure', 'Fantasy', 'Sci-Fi'],
      'image_url': 'http://ia.media-imdb.com/images/M/MV5BNTc4MTc3NTQ5OF5BMl5BanBnXkFtZTcwOTg0NjI4NA@@._V1_SX400_.jpg',
      'plot': "After three years of fighting in the Clone Wars, Anakin Skywalker falls prey to the Sith Lord's lies and makes an enemy of the Jedi and those he loves, concluding his journey to the Dark Side.",
      'rank': 371,
      'rating': 7.7,
      'release_date': '2005-05-15T00:00:00Z',
      'running_time_secs': 8400,
      'title': 'Star Wars: Episode III - Revenge of the Sith',
      'year': 2005},
     'id': 'tt0121766',
     'type': 'add'},
    '_type': 'movie'},
   {'_id': '922',
    '_index': 'movies',
    '_score': 10.064606,
    '_source': {'fields': {'actors': ['Hayden Christensen',
       'Natalie Portman',
       'Ewan McGregor'],
      'directors': ['George Lucas'],
      'genres': ['Action', 'Adventure', 'Fantasy', 'Sci-Fi'],
      'image_url': 'http://ia.media-imdb.com/images/M/MV5BMTY5MjI5NTIwNl5BMl5BanBnXkFtZTYwMTM1Njg2._V1_SX400_.jpg',
      'plot': 'Ten years later, Anakin Skywalker shares a forbidden romance with Padmé, while Obi-Wan investigates an assassination attempt on the Princess and discovers a secret clone army crafted for the Jedi.',
      'rank': 922,
      'rating': 6.7,
      'release_date': '2002-05-16T00:00:00Z',
      'running_time_secs': 8520,
      'title': 'Star Wars: Episode II - Attack of the Clones',
      'year': 2002},
     'id': 'tt0121765',
     'type': 'add'},
    '_type': 'movie'}],
  'max_score': 16.046509,
  'total': {'relation': 'eq', 'value': 4}},
 'timed_out': False,
 'took': 35}
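Every clause listed under `must` has to match, so the list behaves like a logical AND. When the criteria come from user input, the body can be generated rather than written by hand. A minimal sketch, where `bool_must` is a hypothetical helper and not part of the client:

```python
def bool_must(criteria):
    """Build a bool query in which every (field, text) pair must match."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {field: text}} for field, text in criteria.items()
                ]
            }
        }
    }


body = bool_must({"fields.directors": "George", "fields.title": "Star Wars"})
print(body)

# With a running cluster: es.search(index="movies", body=body)
```

The generated body is identical to the one written out in the cell above.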

The SHOULD / MUST criteria¶

In [12]:

es.search(index="movies", body=
{
  "query": {
    "bool": {
      "must": [
                  { "match": { "fields.title": "Star Wars"}}
                  
      ],
      "must_not": { "match": { "fields.directors": "George Miller" }},
      "should": [
                  { "match": { "fields.title": "Star" }},
                  { "match": { "fields.directors": "George Lucas"}}
      ]
    }
  }
})

Out[12]:

{'_shards': {'failed': 0, 'skipped': 0, 'successful': 5, 'total': 5},
 'hits': {'hits': [{'_id': '2509',
    '_index': 'movies',
    '_score': 14.557282,
    '_source': {'fields': {'actors': ['Mark Wahlberg',
       'Jennifer Aniston',
       'Dominic West'],
      'directors': ['Stephen Herek'],
      'genres': ['Comedy', 'Drama', 'Music'],
      'image_url': 'http://ia.media-imdb.com/images/M/MV5BMjE4NTYyNTQ0M15BMl5BanBnXkFtZTcwNDYwMTAyMQ@@._V1_SX400_.jpg',
      'plot': 'Lead singer of a tribute band becomes lead singer of the real band he idolizes.',
      'rank': 2509,
      'rating': 5.9,
      'release_date': '2001-09-04T00:00:00Z',
      'running_time_secs': 6300,
      'title': 'Rock Star',
      'year': 2001},
     'id': 'tt0202470',
     'type': 'add'},
    '_type': 'movie'},
   {'_id': '168',
    '_index': 'movies',
    '_score': 12.612011,
    '_source': {'fields': {'actors': ['Mark Hamill',
       'Harrison Ford',
       'Carrie Fisher'],
      'directors': ['J.J. Abrams'],
      'genres': ['Action', 'Adventure', 'Fantasy', 'Sci-Fi'],
      'plot': 'A continuation of the saga created by George Lucas.',
      'rank': 168,
      'release_date': '2015-01-01T00:00:00Z',
      'title': 'Star Wars: Episode VII',
      'year': 2015},
     'id': 'tt2488496',
     'type': 'add'},
    '_type': 'movie'},
   {'_id': '128',
    '_index': 'movies',
    '_score': 11.051997,
    '_source': {'fields': {'actors': ['Chris Pine',
       'Zachary Quinto',
       'Simon Pegg'],
      'directors': ['J.J. Abrams'],
      'genres': ['Action', 'Adventure', 'Sci-Fi'],
      'image_url': 'http://ia.media-imdb.com/images/M/MV5BMjE5NDQ5OTE4Ml5BMl5BanBnXkFtZTcwOTE3NDIzMw@@._V1_SX400_.jpg',
      'plot': "The brash James T. Kirk tries to live up to his father's legacy with Mr. Spock keeping him in check as a vengeful, time-traveling Romulan creates black holes to destroy the Federation one planet at a time.",
      'rank': 128,
      'rating': 8,
      'release_date': '2009-04-06T00:00:00Z',
      'running_time_secs': 7620,
      'title': 'Star Trek',
      'year': 2009},
     'id': 'tt0796366',
     'type': 'add'},
    '_type': 'movie'},
   {'_id': '2357',
    '_index': 'movies',
    '_score': 11.051997,
    '_source': {'fields': {'actors': ["Dan O'Bannon",
       'Dre Pahich',
       'Brian Narelle'],
      'directors': ['John Carpenter'],
      'genres': ['Comedy', 'Sci-Fi'],
      'image_url': 'http://ia.media-imdb.com/images/M/MV5BMTUwODkwMzk1M15BMl5BanBnXkFtZTcwMjc4ODY3Mw@@._V1_SX400_.jpg',
      'plot': 'In the far reaches of space, a small crew, 20 years into their solitary mission, find things beginning to go hilariously wrong.',
      'rank': 2357,
      'rating': 6.4,
      'release_date': '1974-04-01T00:00:00Z',
      'running_time_secs': 4980,
      'title': 'Dark Star',
      'year': 1974},
     'id': 'tt0069945',
     'type': 'add'},
    '_type': 'movie'},
   {'_id': '1871',
    '_index': 'movies',
    '_score': 10.774647,
    '_source': {'fields': {'actors': ['Chris Cooper',
       'Elizabeth Peña',
       'Stephen Mendillo'],
      'directors': ['John Sayles'],
      'genres': ['Drama', 'Mystery', 'Romance'],
      'image_url': 'http://ia.media-imdb.com/images/M/MV5BMTU3OTI2OTk0N15BMl5BanBnXkFtZTcwMTU3OTYxMQ@@._V1_SX400_.jpg',
      'plot': 'When the skeleton of his murdered predecessor is found, Sheriff Sam Deeds unearths many other long-buried secrets in his Texas border town.',
      'rank': 1871,
      'rating': 7.5,
      'release_date': '1996-06-21T00:00:00Z',
      'running_time_secs': 8100,
      'title': 'Lone Star',
      'year': 1996},
     'id': 'tt0116905',
     'type': 'add'},
    '_type': 'movie'},
   {'_id': '2571',
    '_index': 'movies',
    '_score': 10.774647,
    '_source': {'fields': {'actors': ['Abbie Cornish',
       'Ben Whishaw',
       'Paul Schneider'],
      'directors': ['Jane Campion'],
      'genres': ['Biography', 'Drama', 'Romance'],
      'image_url': 'http://ia.media-imdb.com/images/M/MV5BMTg0NjEwNDgxNF5BMl5BanBnXkFtZTcwMjkyOTM3Mg@@._V1_SX400_.jpg',
      'plot': 'The three-year romance between 19th century poet John Keats and Fanny Brawne.',
      'rank': 2571,
      'rating': 6.9,
      'release_date': '2009-05-15T00:00:00Z',
      'running_time_secs': 7140,
      'title': 'Bright Star',
      'year': 2009},
     'id': 'tt0810784',
     'type': 'add'},
    '_type': 'movie'},
   {'_id': '1921',
    '_index': 'movies',
    '_score': 9.731573,
    '_source': {'fields': {'actors': ['Patrick Stewart',
       'Jonathan Frakes',
       'Brent Spiner'],
      'directors': ['Stuart Baird'],
      'genres': ['Action', 'Adventure', 'Sci-Fi', 'Thriller'],
      'image_url': 'http://ia.media-imdb.com/images/M/MV5BMjAxNjY2NDY3NF5BMl5BanBnXkFtZTcwMjA0MTEzMw@@._V1_SX400_.jpg',
      'plot': 'After the Enterprise is diverted to the Romulan planet of Romulus, supposedly because they want to negotiate a truce, the Federation soon find out the Romulans are planning an attack on Earth.',
      'rank': 1921,
      'rating': 6.3,
      'release_date': '2002-12-09T00:00:00Z',
      'running_time_secs': 6960,
      'title': 'Star Trek: Nemesis',
      'year': 2002},
     'id': 'tt0253754',
     'type': 'add'},
    '_type': 'movie'},
   {'_id': '2041',
    '_index': 'movies',
    '_score': 9.731573,
    '_source': {'fields': {'actors': ['Patrick Stewart',
       'William Shatner',
       'Malcolm McDowell'],
      'directors': ['David Carson'],
      'genres': ['Action', 'Adventure', 'Sci-Fi', 'Thriller'],
      'image_url': 'http://ia.media-imdb.com/images/M/MV5BOTMyODkyODk1MV5BMl5BanBnXkFtZTcwNjk5MzI4OA@@._V1_SX400_.jpg',
      'plot': 'Captain Picard, with the help of supposedly dead Captain Kirk, must stop a madman willing to murder on a planetary scale in order to enter a space matrix.',
      'rank': 2041,
      'rating': 6.5,
      'release_date': '1994-11-17T00:00:00Z',
      'running_time_secs': 7080,
      'title': 'Star Trek: Generations',
      'year': 1994},
     'id': 'tt0111280',
     'type': 'add'},
    '_type': 'movie'},
   {'_id': '2236',
    '_index': 'movies',
    '_score': 9.522415,
    '_source': {'fields': {'actors': ['Patrick Stewart',
       'Jonathan Frakes',
       'Brent Spiner'],
      'directors': ['Jonathan Frakes'],
      'genres': ['Action', 'Adventure', 'Sci-Fi', 'Thriller'],
      'image_url': 'http://ia.media-imdb.com/images/M/MV5BMjA3NDI5MzQ1OF5BMl5BanBnXkFtZTcwMzcxNDI4OA@@._V1_SX400_.jpg',
      'plot': 'When the crew of the Enterprise learn of a Federation plot against the inhabitants of a unique planet, Captain Picard begins an open rebellion.',
      'rank': 2236,
      'rating': 6.3,
      'release_date': '1998-12-11T00:00:00Z',
      'running_time_secs': 6180,
      'title': 'Star Trek: Insurrection',
      'year': 1998},
     'id': 'tt0120844',
     'type': 'add'},
    '_type': 'movie'},
   {'_id': '3277',
    '_index': 'movies',
    '_score': 9.429694,
    '_source': {'fields': {'actors': ['Ziyi Zhang', 'Leehom Wang', 'Ruby Lin'],
      'directors': ['Dennie Gordon'],
      'genres': ['Adventure', 'Comedy'],
      'image_url': 'http://ia.media-imdb.com/images/M/MV5BMTQ0MzE3Mjk4M15BMl5BanBnXkFtZTgwNTMzMTEyMDE@._V1_SX400_.jpg',
      'plot': 'A woman gets caught up in an international diamond heist that draws her near to a spy trying to save the world.',
      'rank': 3277,
      'rating': 6.6,
      'release_date': '2013-09-17T00:00:00Z',
      'running_time_secs': 6840,
      'title': 'My Lucky Star',
      'year': 2013},
     'id': 'tt2102502',
     'type': 'add'},
    '_type': 'movie'}],
  'max_score': 14.557282,
  'total': {'relation': 'eq', 'value': 25}},
 'timed_out': False,
 'took': 30}
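None of the ten hits shown above is directed by George Lucas, and titles such as Rock Star rank first. A plausible explanation is that `match` analyzes its text and, by default, combines the resulting terms with OR: the `must` clause accepts any title containing "star" or "wars", and the `must_not` clause rejects any director containing "george" or "miller". The sketch below shows a stricter variant, assuming the intent was "title contains both words, director is not exactly George Miller"; it uses the `operator` option of `match` and the `match_phrase` query:

```python
def strict_star_wars_query():
    """Variant of the query above that requires whole phrases, not single terms."""
    return {
        "query": {
            "bool": {
                # Both "star" and "wars" must appear in the title.
                "must": [
                    {"match": {"fields.title": {"query": "Star Wars",
                                                "operator": "and"}}}
                ],
                # Exclude only directors matching the phrase "George Miller".
                "must_not": [
                    {"match_phrase": {"fields.directors": "George Miller"}}
                ],
                # Optional clause: boosts the score of George Lucas films.
                "should": [
                    {"match_phrase": {"fields.directors": "George Lucas"}}
                ],
            }
        }
    }


body = strict_star_wars_query()
print(body)

# With a running cluster: es.search(index="movies", body=body)
```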

Filtering queries with `filter`¶

Here we look for recipes that contain a parmesan ingredient and no tuna ingredient, keeping only recipes with a preparation time of 15 minutes or less.

In [13]:

es.search(index="receipe", body={
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "ingredients.name": "parmesan"
          }
        }
      ], 
      "must_not": [
        {
          "match": {
            "ingredients.name": "tuna"
          }
        }
      ], 
      "filter": [
        {
          "range":{
            "preparation_time_minutes": {
              "lte":15
            }
          }
        }
        ]
    }
  }
})

Out[13]:

{'_shards': {'failed': 0, 'skipped': 0, 'successful': 1, 'total': 1},
 'hits': {'hits': [{'_id': '1',
    '_index': 'receipe',
    '_score': 1.379573,
    '_source': {'created': '2017/03/29',
     'description': "Cherry tomatoes are almost always sweeter, riper, and higher in pectin than larger tomatoes at the supermarket. All of these factors mean that cherry tomatoes are fantastic for making a rich, thick, flavorful sauce. Even better: It takes only four ingredients and about 10 minutes, start to finish—less time than it takes to cook the pasta you're gonna serve it with.",
     'ingredients': [{'name': 'Dry pasta', 'quantity': '450g'},
      {'name': 'Kosher salt'},
      {'name': 'Cloves garlic', 'quantity': '4'},
      {'name': 'Extra-virgin olive oil', 'quantity': '90ml'},
      {'name': 'Cherry tomatoes', 'quantity': '750g'},
      {'name': 'Fresh basil leaves', 'quantity': '30g'},
      {'name': 'Freshly ground black pepper'},
      {'name': 'Parmesan cheese'}],
     'preparation_time_minutes': 12,
     'ratings': [4.5, 5.0, 3.0, 4.5],
     'servings': {'max': 6, 'min': 4},
     'steps': ['Place pasta in a large skillet or sauté pan and cover with water and a big pinch of salt. Bring to a boil over high heat, stirring occasionally. Boil until just shy of al dente, about 1 minute less than the package instructions recommend.',
      'Meanwhile, heat garlic and 4 tablespoons (60ml) olive oil in a 12-inch skillet over medium heat, stirring frequently, until garlic is softened but not browned, about 3 minutes. Add tomatoes and cook, stirring, until tomatoes begin to burst. You can help them along by pressing on them with the back of a wooden spoon as they soften.',
      'Continue to cook until sauce is rich and creamy, about 5 minutes longer. Stir in basil and season to taste with salt and pepper.',
      'When pasta is cooked, drain, reserving 1 cup of pasta water. Add pasta to sauce and increase heat to medium-high. Cook, stirring and tossing constantly and adding reserved pasta water as necessary to adjust consistency to a nice, creamy flow. Remove from heat, stir in remaining 2 tablespoons (30ml) olive oil, and grate in a generous shower of Parmesan cheese. Serve immediately, passing extra Parmesan at the table.'],
     'title': 'Fast and Easy Pasta With Blistered Cherry Tomato Sauce'},
    '_type': '_doc'},
   {'_id': '10',
    '_index': 'receipe',
    '_score': 1.2786832,
    '_source': {'created': '2017/04/27',
     'description': 'Exceedingly simple in concept and execution, arrabbiata sauce is tomato sauce with the distinction of being spicy enough to earn its "angry" moniker. Here\'s how to make it, from start to finish.',
     'ingredients': [{'name': 'Kosher salt'},
      {'name': 'Penne pasta', 'quantity': '450g'},
      {'name': 'Extra-virgin olive oil', 'quantity': '3 tablespoons'},
      {'name': 'Clove garlic', 'quantity': '1'},
      {'name': 'Crushed red pepper'},
      {'name': 'Can whole peeled tomatoes', 'quantity': '400g'},
      {'name': 'Finely grated Parmesan cheese', 'quantity': '60g'},
      {'name': 'Minced flat-leaf parsley leaves',
       'quantity': 'Small handful'}],
     'preparation_time_minutes': 15,
     'ratings': [1.5, 2.0, 4.0, 3.5, 3.0, 5.0, 1.5],
     'servings': {'max': 4, 'min': 4},
     'steps': ['In a medium saucepan of boiling salted water, cook penne until just short of al dente, about 1 minute less than the package recommends.',
      'Meanwhile, in a large skillet, combine oil, garlic, and pepper flakes. Cook over medium heat until garlic is very lightly golden, about 5 minutes. (Adjust heat as necessary to keep it gently sizzling.)',
      'Add tomatoes, stir to combine, and bring to a bare simmer. When pasta is ready, transfer it to sauce using a strainer or slotted spoon. (Alternatively, drain pasta through a colander, reserving 1 cup of cooking water. Add drained pasta to sauce.)',
      'Add about 1/4 cup pasta water to sauce and increase heat to bring pasta and sauce to a vigorous simmer. Cook, stirring and shaking the pan and adding more pasta water as necessary to keep sauce loose, until pasta is perfectly al dente, 1 to 2 minutes longer. (The pasta will cook more slowly in the sauce than it did in the water.)',
      'Continue cooking pasta until sauce thickens and begins to coat noodles, then remove from heat and toss in cheese and parsley, stirring vigorously to incorporate. Stir in a drizzle of fresh olive oil, if desired. Season with salt and serve right away, passing more cheese at the table.'],
     'title': 'Penne With Hot-As-You-Dare Arrabbiata Sauce'},
    '_type': '_doc'}],
  'max_score': 1.379573,
  'total': {'relation': 'eq', 'value': 2}},
 'timed_out': False,
 'took': 21}
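A quick way to convince yourself that the three clauses did what was asked is to re-check the hits locally. The sketch below is a rough client-side approximation of the criteria, not how Elasticsearch evaluates them; the first two recipes are abridged from the output above and the third is a hypothetical counter-example:

```python
def satisfies(recipe, wanted="parmesan", unwanted="tuna", max_minutes=15):
    """Approximate the must / must_not / filter clauses on one recipe dict."""
    words = " ".join(i["name"].lower() for i in recipe["ingredients"]).split()
    return (
        wanted in words                                       # must
        and unwanted not in words                             # must_not
        and recipe["preparation_time_minutes"] <= max_minutes # filter (range, lte)
    )


recipes = [
    {"preparation_time_minutes": 12,
     "ingredients": [{"name": "Dry pasta"}, {"name": "Parmesan cheese"}]},
    {"preparation_time_minutes": 15,
     "ingredients": [{"name": "Penne pasta"},
                     {"name": "Finely grated Parmesan cheese"}]},
    # Hypothetical recipe, not in the index: right ingredient, but too slow.
    {"preparation_time_minutes": 40,
     "ingredients": [{"name": "Parmesan cheese"}]},
]

checks = [satisfies(r) for r in recipes]
print(checks)
```

Unlike `must`, clauses placed under `filter` do not contribute to the relevance score; they only include or exclude documents.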

Searching with a prefix¶

Prefix queries find all terms that begin with the given character(s).

In [14]:

es.search(index="cities", body={"query": {"prefix" : { "city" : "l" }}})

Out[14]:

{'_shards': {'failed': 0, 'skipped': 0, 'successful': 1, 'total': 1},
 'hits': {'hits': [{'_id': '2',
    '_index': 'cities',
    '_score': 1.0,
    '_source': {'city': 'London', 'country': 'England'},
    '_type': 'places'},
   {'_id': '3',
    '_index': 'cities',
    '_score': 1.0,
    '_source': {'city': 'Los Angeles', 'country': 'USA'},
    '_type': 'places'}],
  'max_score': 1.0,
  'total': {'relation': 'eq', 'value': 2}},
 'timed_out': False,
 'took': 11}
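The lowercase "l" matches "London" and "Los Angeles" because `city` is a `text` field: its values are analyzed into lowercase terms at index time, while the prefix itself is not analyzed. The sketch below emulates this in plain Python; it is only an approximation of the standard analyzer (whitespace split plus lowercasing), and both functions are hypothetical:

```python
cities = ["New Delhi", "London", "Los Angeles"]


def analyzed_terms(text):
    # Rough stand-in for the standard analyzer: split on whitespace, lowercase.
    return text.lower().split()


def prefix_on_text(prefix):
    """Emulate a prefix query on the analyzed `city` field."""
    return [c for c in cities
            if any(term.startswith(prefix) for term in analyzed_terms(c))]


def prefix_on_keyword(prefix):
    """Emulate a prefix query on `city.keyword`, which stores the exact value."""
    return [c for c in cities if c.startswith(prefix)]


print(prefix_on_text("l"))     # ['London', 'Los Angeles']
print(prefix_on_keyword("l"))  # []
print(prefix_on_keyword("L"))  # ['London', 'Los Angeles']
```

On the `city.keyword` sub-field the same search would therefore need an uppercase "L".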

Searching with regular expressions¶

In [15]:

# match everything
es.search(index="cities", body={"query": {"regexp" : { "city" : ".*" }}})

Out[15]:

{'_shards': {'failed': 0, 'skipped': 0, 'successful': 1, 'total': 1},
 'hits': {'hits': [{'_id': '1',
    '_index': 'cities',
    '_score': 1.0,
    '_source': {'city': 'New Delhi', 'country': 'India'},
    '_type': 'places'},
   {'_id': '2',
    '_index': 'cities',
    '_score': 1.0,
    '_source': {'city': 'London', 'country': 'England'},
    '_type': 'places'},
   {'_id': '3',
    '_index': 'cities',
    '_score': 1.0,
    '_source': {'city': 'Los Angeles', 'country': 'USA'},
    '_type': 'places'}],
  'max_score': 1.0,
  'total': {'relation': 'eq', 'value': 3}},
 'timed_out': False,
 'took': 4}

In [16]:

# cities that start with "l"
es.search(index="cities", body={"query": {"regexp" : { "city" : "l.*" }}})

Out[16]:

{'_shards': {'failed': 0, 'skipped': 0, 'successful': 1, 'total': 1},
 'hits': {'hits': [{'_id': '2',
    '_index': 'cities',
    '_score': 1.0,
    '_source': {'city': 'London', 'country': 'England'},
    '_type': 'places'},
   {'_id': '3',
    '_index': 'cities',
    '_score': 1.0,
    '_source': {'city': 'Los Angeles', 'country': 'USA'},
    '_type': 'places'}],
  'max_score': 1.0,
  'total': {'relation': 'eq', 'value': 2}},
 'timed_out': False,
 'took': 15}

In [17]:

# cities that start with "l" and end with "n"
es.search(index="cities", body={"query": {"regexp" : { "city" : "l.*n" }}})

Out[17]:

{'_shards': {'failed': 0, 'skipped': 0, 'successful': 1, 'total': 1},
 'hits': {'hits': [{'_id': '2',
    '_index': 'cities',
    '_score': 1.0,
    '_source': {'city': 'London', 'country': 'England'},
    '_type': 'places'}],
  'max_score': 1.0,
  'total': {'relation': 'eq', 'value': 1}},
 'timed_out': False,
 'took': 50}
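Elasticsearch regular expressions are anchored: the pattern has to match a whole indexed term, not just part of it. That is why "l.*n" returns London (term "london") but not Los Angeles (terms "los" and "angeles"). The sketch below emulates this with `re.fullmatch` on lowercased, whitespace-split terms. It is an approximation for simple patterns only, since the Lucene regular expression syntax is not the same as Python's:

```python
import re

cities = ["New Delhi", "London", "Los Angeles"]


def regexp_on_text(pattern):
    """Emulate a regexp query on the analyzed `city` field."""
    compiled = re.compile(pattern)
    return [c for c in cities
            # fullmatch mimics the implicit anchoring at both ends of the term.
            if any(compiled.fullmatch(term) for term in c.lower().split())]


print(regexp_on_text(".*"))    # ['New Delhi', 'London', 'Los Angeles']
print(regexp_on_text("l.*"))   # ['London', 'Los Angeles']
print(regexp_on_text("l.*n"))  # ['London']
print(regexp_on_text("on"))    # [] because no whole term equals "on"
```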

Aggregation¶

In [18]:

# simple aggregation: number of movies per year
res = es.search(index="movies",body={"aggs" : {
    "nb_par_annee" : {
        "terms" : {"field" : "fields.year"}
}}})
res['aggregations']

Out[18]:

{'nb_par_annee': {'buckets': [{'doc_count': 448, 'key': 2013},
   {'doc_count': 404, 'key': 2012},
   {'doc_count': 308, 'key': 2011},
   {'doc_count': 253, 'key': 2009},
   {'doc_count': 249, 'key': 2010},
   {'doc_count': 207, 'key': 2008},
   {'doc_count': 204, 'key': 2006},
   {'doc_count': 200, 'key': 2007},
   {'doc_count': 170, 'key': 2005},
   {'doc_count': 152, 'key': 2014}],
  'doc_count_error_upper_bound': 52,
  'sum_other_doc_count': 2192}}
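A `terms` aggregation returns only the largest buckets (10 by default); the documents that fall outside them are summarized in `sum_other_doc_count`. The sketch below turns the buckets printed above into a plain dictionary. `bucket_counts` is a hypothetical helper, not part of the client.

```python
# Copy of the `terms` aggregation result shown above.
aggregations = {
    "nb_par_annee": {
        "buckets": [
            {"doc_count": 448, "key": 2013},
            {"doc_count": 404, "key": 2012},
            {"doc_count": 308, "key": 2011},
            {"doc_count": 253, "key": 2009},
            {"doc_count": 249, "key": 2010},
            {"doc_count": 207, "key": 2008},
            {"doc_count": 204, "key": 2006},
            {"doc_count": 200, "key": 2007},
            {"doc_count": 170, "key": 2005},
            {"doc_count": 152, "key": 2014},
        ],
        "doc_count_error_upper_bound": 52,
        "sum_other_doc_count": 2192,
    }
}

def bucket_counts(agg):
    # Map each bucket key (here a year) to its document count.
    return {bucket["key"]: bucket["doc_count"] for bucket in agg["buckets"]}

counts = bucket_counts(aggregations["nb_par_annee"])
in_buckets = sum(counts.values())
print(counts)
print(in_buckets, aggregations["nb_par_annee"]["sum_other_doc_count"])
```

To see more years, the `terms` aggregation accepts a `size` parameter next to `field`.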

In [19]:

# simple metric aggregation: average rating
res = es.search(index="movies", body={"aggs": {
    "note_moyenne": {
        "avg": {"field": "fields.rating"}
    }
}})
res['aggregations']

Out[19]:

{'note_moyenne': {'value': 6.387107691895831}}

In [20]:

# nested aggregation: basic rating statistics (average, min, max) per year
res = es.search(index="movies", body={"aggs": {
    "group_year": {
        "terms": {"field": "fields.year"},
        "aggs": {
            "note_moyenne": {"avg": {"field": "fields.rating"}},
            "note_min": {"min": {"field": "fields.rating"}},
            "note_max": {"max": {"field": "fields.rating"}}
        }
    }
}})
res["aggregations"]

Out[20]:

{'group_year': {'buckets': [{'doc_count': 448,
    'key': 2013,
    'note_max': {'value': 8.699999809265137},
    'note_min': {'value': 2.5},
    'note_moyenne': {'value': 5.962700002789497}},
   {'doc_count': 404,
    'key': 2012,
    'note_max': {'value': 8.600000381469727},
    'note_min': {'value': 2.4000000953674316},
    'note_moyenne': {'value': 5.961786593160322}},
   {'doc_count': 308,
    'key': 2011,
    'note_max': {'value': 8.5},
    'note_min': {'value': 1.7000000476837158},
    'note_moyenne': {'value': 6.114285714440531}},
   {'doc_count': 253,
    'key': 2009,
    'note_max': {'value': 8.399999618530273},
    'note_min': {'value': 2.700000047683716},
    'note_moyenne': {'value': 6.268774692248921}},
   {'doc_count': 249,
    'key': 2010,
    'note_max': {'value': 8.800000190734863},
    'note_min': {'value': 1.7999999523162842},
    'note_moyenne': {'value': 6.239759046868627}},
   {'doc_count': 207,
    'key': 2008,
    'note_max': {'value': 9.0},
    'note_min': {'value': 1.7999999523162842},
    'note_moyenne': {'value': 6.230917865527425}},
   {'doc_count': 204,
    'key': 2006,
    'note_max': {'value': 8.5},
    'note_min': {'value': 1.7999999523162842},
    'note_moyenne': {'value': 6.31617646708208}},
   {'doc_count': 200,
    'key': 2007,
    'note_max': {'value': 8.300000190734863},
    'note_min': {'value': 2.200000047683716},
    'note_moyenne': {'value': 6.419499988555908}},
   {'doc_count': 170,
    'key': 2005,
    'note_max': {'value': 8.300000190734863},
    'note_min': {'value': 2.299999952316284},
    'note_moyenne': {'value': 6.289999998317045}},
   {'doc_count': 152,
    'key': 2014,
    'note_max': {'value': 4.860000133514404},
    'note_min': {'value': 4.860000133514404},
    'note_moyenne': {'value': 4.860000133514404}}],
  'doc_count_error_upper_bound': 52,
  'sum_other_doc_count': 2192}}
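Each bucket of a nested aggregation carries its sub-aggregations under the names chosen in the request. The sketch below flattens two buckets copied from the output above into rows that are easier to read or load into a table. `bucket_rows` is a hypothetical helper; the rounding is there because values such as 8.699999809265137 look like single-precision representations of 8.7 (an assumption about how the field is stored).

```python
# Two buckets copied from the output above.
group_year = {
    "buckets": [
        {"doc_count": 448, "key": 2013,
         "note_max": {"value": 8.699999809265137},
         "note_min": {"value": 2.5},
         "note_moyenne": {"value": 5.962700002789497}},
        {"doc_count": 404, "key": 2012,
         "note_max": {"value": 8.600000381469727},
         "note_min": {"value": 2.4000000953674316},
         "note_moyenne": {"value": 5.961786593160322}},
    ]
}

def bucket_rows(agg, digits=2):
    # Build one flat dict per bucket, rounding the metric values.
    rows = []
    for bucket in agg["buckets"]:
        rows.append({
            "year": bucket["key"],
            "count": bucket["doc_count"],
            "min": round(bucket["note_min"]["value"], digits),
            "max": round(bucket["note_max"]["value"], digits),
            "avg": round(bucket["note_moyenne"]["value"], digits),
        })
    return rows

rows = bucket_rows(group_year)
for row in rows:
    print(row)
```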

📟 Exercise [optional]¶

Try other queries.

Datetime aggregation¶

To illustrate aggregation by datetime, we create a `travel` index and use documents such as:

doc1 = {"city":"Bangalore", "country":"India","datetime": datetime.datetime(2018,1,1,10,20,0)}

In [21]:

# specify the mapping and create the index
if es.indices.exists(index="travel"):
    es.indices.delete(index="travel", ignore=[400, 404])

settings = {
    "settings": {
        "number_of_shards": 2,
        "number_of_replicas": 1
    },
    "mappings": {
        "properties": {
            "city": {
                "type": "text",
                "fields": {
                    "keyword": {"type": "keyword", "ignore_above": 256}
                }
            },
            "country": {
                "type": "text",
                "fields": {
                    "keyword": {"type": "keyword", "ignore_above": 256}
                }
            },
            "datetime": {"type": "date"}
        }
    }
}
es.indices.create(index="travel", ignore=400, body=settings)

Out[21]:

{'acknowledged': True, 'index': 'travel', 'shards_acknowledged': True}

In [22]:

import datetime

# datetime.datetime(year, month, day, hour, minute, second)
doc1 = {"city": "Bangalore", "country": "India", "datetime": datetime.datetime(2018, 1, 1, 10, 20, 0)}
doc2 = {"city": "London", "country": "England", "datetime": datetime.datetime(2018, 1, 2, 22, 30, 0)}
doc3 = {"city": "Los Angeles", "country": "USA", "datetime": datetime.datetime(2018, 4, 19, 18, 20, 0)}
es.index(index="travel", id=1, body=doc1)
es.index(index="travel", id=2, body=doc2)
es.index(index="travel", id=3, body=doc3)

Out[22]:

{'_id': '3',
 '_index': 'travel',
 '_primary_term': 1,
 '_seq_no': 2,
 '_shards': {'failed': 0, 'successful': 1, 'total': 2},
 '_type': '_doc',
 '_version': 1,
 'result': 'created'}
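Elasticsearch receives documents as JSON, so the `datetime` objects above have to be turned into strings or numbers before they are sent. As far as we know, the Python client does this by emitting ISO 8601 strings; the sketch below reproduces that idea with the standard library only. `to_json_doc` is a hypothetical helper, not a client function.

```python
import datetime
import json

def to_json_doc(doc):
    # Serialize a document, turning date/datetime values into ISO 8601 strings.
    def default(value):
        if isinstance(value, (datetime.date, datetime.datetime)):
            return value.isoformat()
        raise TypeError(f"Cannot serialize {value!r}")
    return json.dumps(doc, default=default)

doc1 = {"city": "Bangalore", "country": "India",
        "datetime": datetime.datetime(2018, 1, 1, 10, 20, 0)}
payload = to_json_doc(doc1)
print(payload)
```

A string of this shape is accepted by a field mapped as `date` with the default format.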

In [23]:

es.indices.get_mapping(index='travel')

Out[23]:

{'travel': {'mappings': {'properties': {'city': {'fields': {'keyword': {'ignore_above': 256,
       'type': 'keyword'}},
     'type': 'text'},
    'country': {'fields': {'keyword': {'ignore_above': 256,
       'type': 'keyword'}},
     'type': 'text'},
    'datetime': {'type': 'date'}}}}}

In [24]:

es.search(index="travel", body={
    "from": 0,
    "size": 0,
    "query": {"match_all": {}},
    "aggs": {
        "country": {
            "date_histogram": {"field": "datetime", "calendar_interval": "year"}
        }
    }
})

Out[24]:

{'_shards': {'failed': 0, 'skipped': 0, 'successful': 2, 'total': 2},
 'aggregations': {'country': {'buckets': []}},
 'hits': {'hits': [],
  'max_score': None,
  'total': {'relation': 'eq', 'value': 0}},
 'timed_out': False,
 'took': 8}
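The response above reports 0 hits and an empty `buckets` list even though three documents were just indexed. A likely explanation (an assumption, not verified against this cluster) is that the search ran before the index was refreshed: newly indexed documents only become searchable after a refresh, which by default happens about once per second. Passing `refresh=True` to `es.index` or calling `es.indices.refresh(index="travel")` before searching should make them visible. The sketch below computes locally what a yearly histogram over the three documents would contain; `yearly_histogram` is a hypothetical helper.

```python
import datetime
from collections import Counter

# The datetime values of doc1, doc2 and doc3.
dates = [
    datetime.datetime(2018, 1, 1, 10, 20, 0),
    datetime.datetime(2018, 1, 2, 22, 30, 0),
    datetime.datetime(2018, 4, 19, 18, 20, 0),
]

def yearly_histogram(values):
    # Count values per calendar year, like calendar_interval "year".
    return dict(sorted(Counter(value.year for value in values).items()))

buckets = yearly_histogram(dates)
print(buckets)
```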

📟 Exercise [optional]¶

Create the following document and index it, then display the previous histogram again and describe what has changed.

doc4 = {"city":"Sydney", "country":"Australia","datetime":datetime.datetime(2019,4,19,18,20,0)}

In [25]:

Out[25]:

{'_id': '4',
 '_index': 'travel',
 '_primary_term': 1,
 '_seq_no': 0,
 '_shards': {'failed': 0, 'successful': 1, 'total': 2},
 '_type': '_doc',
 '_version': 1,
 'result': 'created'}

In [26]:

Out[26]:

{'_shards': {'failed': 0, 'skipped': 0, 'successful': 2, 'total': 2},
 'aggregations': {'country': {'buckets': []}},
 'hits': {'hits': [],
  'max_score': None,
  'total': {'relation': 'eq', 'value': 0}},
 'timed_out': False,
 'took': 5}

Introduction to text search: the `_analyze` endpoint¶

Building an analyzer¶

Before starting this part, make sure you have created a French analyzer in Elasticsearch. Here is the example French analyzer seen in the course:

PUT french
{
  "settings": {
    "analysis": { 
      "filter": {
        "french_elision": {
          "type": "elision",
          "articles_case": true,
          "articles": ["l", "m", "t", "qu", "n", "s", "j", "d", "c", "jusqu", "quoiqu", "lorsqu", "puisqu"]
        },
        "french_synonym": {
          "type": "synonym",
          "ignore_case": true,
          "expand": true,
          "synonyms": [
            "réviser, étudier, bosser",
            "mayo, mayonnaise",
            "grille, toaste"
          ]
        },
        "french_stemmer": {
          "type": "stemmer",
          "language": "light_french"
        }
      },
      "analyzer": {
        "french_heavy": {
          "tokenizer": "icu_tokenizer",
          "filter": [
            "french_elision",
            "icu_folding",
            "french_synonym",
            "french_stemmer"
          ]
        },
        "french_light": {
          "tokenizer": "icu_tokenizer",
          "filter": [
            "french_elision",
            "icu_folding"
          ]
        }
      }
    }
  }
}

🤓 Make sure the `analysis-icu` plugin, which provides `icu_tokenizer` and `icu_folding`, is installed first; otherwise creating the index fails with an error.
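The same index can be created from Python by passing the settings as a dictionary. The sketch below mirrors the JSON above and lists the filters that the analyzers reference without defining them, which are the ones expected to come from a plugin. `create_french_index` and `undefined_filters` are hypothetical helpers, and `create_french_index` is defined but not called because it needs a running cluster.

```python
french_settings = {"settings": {"analysis": {
    "filter": {
        "french_elision": {
            "type": "elision",
            "articles_case": True,
            "articles": ["l", "m", "t", "qu", "n", "s", "j", "d", "c",
                         "jusqu", "quoiqu", "lorsqu", "puisqu"],
        },
        "french_synonym": {
            "type": "synonym",
            "ignore_case": True,
            "expand": True,
            "synonyms": ["réviser, étudier, bosser", "mayo, mayonnaise", "grille, toaste"],
        },
        "french_stemmer": {"type": "stemmer", "language": "light_french"},
    },
    "analyzer": {
        "french_heavy": {
            "tokenizer": "icu_tokenizer",
            "filter": ["french_elision", "icu_folding", "french_synonym", "french_stemmer"],
        },
        "french_light": {
            "tokenizer": "icu_tokenizer",
            "filter": ["french_elision", "icu_folding"],
        },
    },
}}}

def create_french_index(es, name="french"):
    # `es` is an already connected Elasticsearch client (not created here).
    return es.indices.create(index=name, body=french_settings)

def undefined_filters(settings):
    # Filters used by an analyzer but not declared in the "filter" section.
    analysis = settings["settings"]["analysis"]
    used = {f for a in analysis["analyzer"].values() for f in a["filter"]}
    return used - set(analysis["filter"])

external = undefined_filters(french_settings)
print(external)
```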

In [27]:

doc1 = {"text" : "Une phrase en français :) ..."}
es.index(index="french", id=1, body=doc1)

Out[27]:

{'_id': '1',
 '_index': 'french',
 '_primary_term': 1,
 '_seq_no': 2,
 '_shards': {'failed': 0, 'successful': 1, 'total': 2},
 '_type': '_doc',
 '_version': 3,
 'result': 'updated'}

In [28]:

es.indices.analyze(index="french",body={
  "text" : "Je dois bosser pour mon QCM sinon je vais avoir une sale note :( ..."
})

Out[28]:

{'tokens': [{'end_offset': 2,
   'position': 0,
   'start_offset': 0,
   'token': 'je',
   'type': '<ALPHANUM>'},
  {'end_offset': 7,
   'position': 1,
   'start_offset': 3,
   'token': 'dois',
   'type': '<ALPHANUM>'},
  {'end_offset': 14,
   'position': 2,
   'start_offset': 8,
   'token': 'bosser',
   'type': '<ALPHANUM>'},
  {'end_offset': 19,
   'position': 3,
   'start_offset': 15,
   'token': 'pour',
   'type': '<ALPHANUM>'},
  {'end_offset': 23,
   'position': 4,
   'start_offset': 20,
   'token': 'mon',
   'type': '<ALPHANUM>'},
  {'end_offset': 27,
   'position': 5,
   'start_offset': 24,
   'token': 'qcm',
   'type': '<ALPHANUM>'},
  {'end_offset': 33,
   'position': 6,
   'start_offset': 28,
   'token': 'sinon',
   'type': '<ALPHANUM>'},
  {'end_offset': 36,
   'position': 7,
   'start_offset': 34,
   'token': 'je',
   'type': '<ALPHANUM>'},
  {'end_offset': 41,
   'position': 8,
   'start_offset': 37,
   'token': 'vais',
   'type': '<ALPHANUM>'},
  {'end_offset': 47,
   'position': 9,
   'start_offset': 42,
   'token': 'avoir',
   'type': '<ALPHANUM>'},
  {'end_offset': 51,
   'position': 10,
   'start_offset': 48,
   'token': 'une',
   'type': '<ALPHANUM>'},
  {'end_offset': 56,
   'position': 11,
   'start_offset': 52,
   'token': 'sale',
   'type': '<ALPHANUM>'},
  {'end_offset': 61,
   'position': 12,
   'start_offset': 57,
   'token': 'note',
   'type': '<ALPHANUM>'}]}
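The call above does not name an analyzer, so the index's default analyzer is applied; this would explain why `bosser` comes back unchanged, with no synonym expansion or stemming. To exercise a custom analyzer, the `_analyze` body accepts an `analyzer` key. The sketch below builds such a body and extracts the token strings from a response. `analyze_body` and `token_strings` are hypothetical helpers, and `sample_response` is a shortened copy of the output above.

```python
def analyze_body(text, analyzer=None):
    # Build the body of an _analyze request, optionally naming an analyzer.
    body = {"text": text}
    if analyzer is not None:
        body["analyzer"] = analyzer
    return body

def token_strings(response):
    # Keep only the token text, in order of appearance.
    return [token["token"] for token in response["tokens"]]

body = analyze_body("Je dois bosser pour mon QCM", analyzer="french_heavy")
# With a live client: es.indices.analyze(index="french", body=body)

sample_response = {"tokens": [
    {"token": "je", "position": 0, "start_offset": 0, "end_offset": 2, "type": "<ALPHANUM>"},
    {"token": "dois", "position": 1, "start_offset": 3, "end_offset": 7, "type": "<ALPHANUM>"},
    {"token": "bosser", "position": 2, "start_offset": 8, "end_offset": 14, "type": "<ALPHANUM>"},
]}
print(body)
print(token_strings(sample_response))
```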

📟 Exercise [optional]¶

Add smiley recognition to your analyzer so that it performs the following mapping:

:) -> _content_
:( -> _triste_

Then run a request from Python on the document below:

{
     "text" : "Je dois bosser pour mon QCM sinon je vais avoir une sale note :( ..."
}

In [29]:

Out[29]:

{'tokens': [{'end_offset': 2,
   'position': 0,
   'start_offset': 0,
   'token': 'je',
   'type': '<ALPHANUM>'},
  {'end_offset': 7,
   'position': 1,
   'start_offset': 3,
   'token': 'dois',
   'type': '<ALPHANUM>'},
  {'end_offset': 14,
   'position': 2,
   'start_offset': 8,
   'token': 'bosser',
   'type': '<ALPHANUM>'},
  {'end_offset': 19,
   'position': 3,
   'start_offset': 15,
   'token': 'pour',
   'type': '<ALPHANUM>'},
  {'end_offset': 23,
   'position': 4,
   'start_offset': 20,
   'token': 'mon',
   'type': '<ALPHANUM>'},
  {'end_offset': 27,
   'position': 5,
   'start_offset': 24,
   'token': 'qcm',
   'type': '<ALPHANUM>'},
  {'end_offset': 33,
   'position': 6,
   'start_offset': 28,
   'token': 'sinon',
   'type': '<ALPHANUM>'},
  {'end_offset': 36,
   'position': 7,
   'start_offset': 34,
   'token': 'je',
   'type': '<ALPHANUM>'},
  {'end_offset': 41,
   'position': 8,
   'start_offset': 37,
   'token': 'vais',
   'type': '<ALPHANUM>'},
  {'end_offset': 47,
   'position': 9,
   'start_offset': 42,
   'token': 'avoir',
   'type': '<ALPHANUM>'},
  {'end_offset': 51,
   'position': 10,
   'start_offset': 48,
   'token': 'une',
   'type': '<ALPHANUM>'},
  {'end_offset': 56,
   'position': 11,
   'start_offset': 52,
   'token': 'sale',
   'type': '<ALPHANUM>'},
  {'end_offset': 61,
   'position': 12,
   'start_offset': 57,
   'token': 'note',
   'type': '<ALPHANUM>'},
  {'end_offset': 64,
   'position': 13,
   'start_offset': 62,
   'token': '_triste_',
   'type': '<ALPHANUM>'}]}
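One possible way to obtain the `_triste_` token shown above is a `mapping` character filter, which rewrites character sequences before the text is tokenized. The fragment below is only a sketch: the names `smiley` and `french_smiley` are hypothetical, the analyzer uses the built-in `standard` tokenizer and `lowercase` filter to stay self-contained, and `simulate` merely approximates the analysis chain locally (apply the mappings, lowercase, split on non-word characters).

```python
import re

smiley_analysis = {
    "char_filter": {
        "smiley": {  # hypothetical name
            "type": "mapping",
            "mappings": [":) => _content_", ":( => _triste_"],
        }
    },
    "analyzer": {
        "french_smiley": {  # hypothetical name
            "char_filter": ["smiley"],
            "tokenizer": "standard",
            "filter": ["lowercase"],
        }
    },
}

def simulate(text, mappings):
    # Apply each "source => target" rule, then lowercase and split into words.
    for rule in mappings:
        source, target = [part.strip() for part in rule.split("=>")]
        text = text.replace(source, target)
    return re.findall(r"\w+", text.lower())

text = "Je dois bosser pour mon QCM sinon je vais avoir une sale note :( ..."
tokens = simulate(text, smiley_analysis["char_filter"]["smiley"]["mappings"])
print(tokens)
```

The character filter has to run before the tokenizer because the tokenizer would otherwise discard the punctuation that makes up the smiley.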
