Failed to create nodes

I am trying to populate a local Neo4j database using the following Python code:

```python
NEO4J_URI = "bolt://localhost:7687"
username = "neo4j"
password = "test_password"

import neo4j
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings

driver = neo4j.GraphDatabase.driver(NEO4J_URI, auth=(username, password))

basic_node_labels = ["Object", "Entity", "Group", "Person", "Organization", "Place"]
academic_node_labels = ["ArticleOrPaper", "PublicationOrJournal"]
climate_change_node_labels = ["GreenhouseGas", "TemperatureRise", "ClimateModel", "CarbonFootprint", "EnergySource"]

node_labels = basic_node_labels + academic_node_labels + climate_change_node_labels

rel_types = ["AFFECTS", "CAUSES", "ASSOCIATED_WITH", "DESCRIBES", "PREDICTS", "IMPACTS"]

prompt_template = '''
You are a climate researcher tasked with extracting information from research papers and structuring it in a property graph.

Extract the entities (nodes) and specify their type from the following text.
Also extract the relationships between these nodes.

Return the result as JSON using the following format:
{{"nodes": [ {{"idx": "0", "label": "entity type", "properties": {{"name": "entity name"}} }} ],
  "relationships": [{{"type": "RELATIONSHIP_TYPE", "start_node_id": "0", "end_node_id": "1", "properties": {{"details": "Relationship details"}} }}] }}

Input text:

{text}
'''


from neo4j_graphrag.experimental.components.text_splitters.fixed_size_splitter import FixedSizeSplitter
from neo4j_graphrag.experimental.pipeline.kg_builder import SimpleKGPipeline

from neo4j_graphrag.embeddings.ollama import OllamaEmbeddings
from neo4j_graphrag.llm import OllamaLLM

embedder = OllamaEmbeddings(model="mxbai-embed-large")
llm = OllamaLLM(model_name="llama3.2:3b", model_params={"temperature": 0.7})
kg_builder_pdf = SimpleKGPipeline(
    llm=llm,
    driver=driver,
    text_splitter=FixedSizeSplitter(chunk_size=500, chunk_overlap=100),
    embedder=embedder,
    entities=node_labels,
    relations=rel_types,
    prompt_template=prompt_template,
    from_pdf=True
)

pdf_file_paths = ['./data/pdf/ToxipediaGreenhouseEffectArchive.pdf',]

import asyncio
for path in pdf_file_paths:
    print(f"Processing: {path}")
    result = asyncio.run( kg_builder_pdf.run_async(file_path=path) )
    print(f"Result: {result}")
```

However, when I run it, the warning `LLM response has improper format for chunk_index=` is logged for every chunk. The final output looks like this:

`Received notification from DBMS server: {severity: WARNING} {code: Neo.ClientNotification.Statement.UnknownLabelWarning} {category: UNRECOGNIZED} {title: The provided label is not in the database.} {description: One of the labels in your query is not available in the database, make sure you didn't misspell it or that the label is available when you run this statement in your application (the missing label name is: __Entity__)} {position: line: 1, column: 15, offset: 14} for query: 'MATCH (entity:__Entity__)  RETURN count(entity) as c'
Result: run_id='66af88ce-afcc-47dc-9154-32a7299ddee0' result={'resolver': {'number_of_nodes_to_resolve': 0, 'number_of_created_nodes': None}}`

I also tried initializing `SimpleKGPipeline` like this:

```python
kg_builder_pdf = SimpleKGPipeline(
    llm=llm,
    driver=driver,
    text_splitter=FixedSizeSplitter(chunk_size=500, chunk_overlap=50),
    embedder=embedder,
    from_pdf=True
)
```

This produces the same warning and result:

```text
Received notification from DBMS server: {severity: WARNING} {code: Neo.ClientNotification.Statement.UnknownLabelWarning} {category: UNRECOGNIZED} {title: The provided label is not in the database.} {description: One of the labels in your query is not available in the database, make sure you didn't misspell it or that the label is available when you run this statement in your application (the missing label name is: __Entity__)} {position: line: 1, column: 15, offset: 14} for query: 'MATCH (entity:__Entity__)  RETURN count(entity) as c'
Result: run_id='af156a08-dc80-4af7-bfd5-3cce563006a4' result={'resolver': {'number_of_nodes_to_resolve': 0, 'number_of_created_nodes': None}}
```
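
The per-chunk `improper format` warning together with `number_of_nodes_to_resolve: 0` suggests that no chunk produced a reply that could be parsed into nodes and relationships. Below is a minimal sketch for checking a raw model reply against the JSON shape requested in the prompt. The helper name `check_extraction_reply` and the sample strings are hypothetical, and the check is only an approximation of whatever validation the pipeline performs internally; the raw reply would have to be obtained separately, for example by calling the LLM directly with the formatted prompt.

```python
import json


def check_extraction_reply(raw: str) -> list[str]:
    """Return a list of problems found in a raw LLM reply (empty if none)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Small models often wrap the JSON in prose or Markdown fences.
        return [f"not valid JSON: {exc.msg} at position {exc.pos}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    problems = []
    # The prompt asks for exactly these two top-level lists.
    for key in ("nodes", "relationships"):
        if not isinstance(data.get(key), list):
            problems.append(f"missing or non-list key: {key!r}")
    return problems


# Hypothetical replies: one clean, one with a chatty preamble.
good = (
    '{"nodes": [{"idx": "0", "label": "GreenhouseGas", '
    '"properties": {"name": "CO2"}}], "relationships": []}'
)
chatty = "Here is the JSON you asked for:\n" + good

print(check_extraction_reply(good))
print(check_extraction_reply(chatty))
```

If replies from `llama3.2:3b` fail this kind of check, that would be consistent with the model being the cause rather than the database or the driver.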


I suspect the problem is caused by the LLM I am using to extract the entities and relationships. Any ideas on how to fix this issue? Thanks!

Failed to create nodes #291
