Caching
Module cache_checker
CacheChecker
Checks for the presence of documents in a Document Store based on a specified field in each document's metadata.
If matching documents are found, they are returned as "hits". Items with no matching document in the cache are returned as "misses".
Usage example
```python
from haystack import Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.caching.cache_checker import CacheChecker

docstore = InMemoryDocumentStore()
documents = [
    Document(content="doc1", meta={"url": "https://example.com/1"}),
    Document(content="doc2", meta={"url": "https://example.com/2"}),
    Document(content="doc3", meta={"url": "https://example.com/1"}),
    Document(content="doc4", meta={"url": "https://example.com/2"}),
]
docstore.write_documents(documents)

checker = CacheChecker(docstore, cache_field="url")
results = checker.run(items=["https://example.com/1", "https://example.com/5"])
assert results == {"hits": [documents[0], documents[2]], "misses": ["https://example.com/5"]}
```
CacheChecker.__init__
Creates a CacheChecker component.
Arguments:
document_store: Document Store to check for the presence of specific documents.
cache_field: Name of the document's metadata field to check for cache hits.
CacheChecker.to_dict
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
CacheChecker.from_dict
Deserializes the component from a dictionary.
Arguments:
data: Dictionary to deserialize from.
Returns:
Deserialized component.
CacheChecker.run
Checks, for each item, whether a document whose cache field matches that item is already present in the store.
Arguments:
items: Values to be checked against the cache field.
Returns:
A dictionary with two keys:
hits - Documents that matched at least one of the items.
misses - Items that were not present in any documents.
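The hit/miss split described above can be modeled in a few lines of plain Python. This is only a simplified sketch under stated assumptions, not Haystack's implementation: `Doc` and `check_cache` are hypothetical names introduced for illustration, and a plain list stands in for the Document Store.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a document: content plus a metadata dict."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def check_cache(stored: List[Doc], cache_field: str, items: List[Any]) -> Dict[str, list]:
    """Split items into hits (matching documents) and misses (unmatched items)."""
    hits: List[Doc] = []
    misses: List[Any] = []
    for item in items:
        # Collect every stored document whose metadata field equals this item.
        found = [doc for doc in stored if doc.meta.get(cache_field) == item]
        if found:
            hits.extend(found)
        else:
            # No stored document carries this value, so the item is a miss.
            misses.append(item)
    return {"hits": hits, "misses": misses}


# A plain list plays the role of the Document Store in this sketch.
stored = [
    Doc("doc1", {"url": "https://example.com/1"}),
    Doc("doc2", {"url": "https://example.com/2"}),
    Doc("doc3", {"url": "https://example.com/1"}),
]
result = check_cache(stored, "url", ["https://example.com/1", "https://example.com/5"])
print([d.content for d in result["hits"]])  # ['doc1', 'doc3']
print(result["misses"])                     # ['https://example.com/5']
```

Note that "hits" holds documents while "misses" holds the original item values, so a caller would typically fetch or compute only the entries listed under "misses".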