I have created an API script with FastAPI to run a HF transformers NER model. However, I am puzzled to see that the output returned by the API doesn’t match the output I get if I run the model directly (the entities found are different). It looks as if the encoding of the input text is changed at some point, but I can’t figure out where. My code:
main.py
from fastapi import FastAPI
from pydantic import BaseModel
from typing import List
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
app = FastAPI()
tokenizer1 = AutoTokenizer.from_pretrained("lcampillos/roberta-es-clinical-trials-ner")
model1 = AutoModelForTokenClassification.from_pretrained("lcampillos/roberta-es-clinical-trials-ner")
pipe1 = pipeline(task="ner", model=model1.to("cpu"), tokenizer=tokenizer1)
class PredictionInput(BaseModel):
    text: str

class NERPrediction(BaseModel):
    entity_group: str
    score: float
    word: str
    start: int
    end: int

class PredictionResult(BaseModel):
    predictions: List[NERPrediction]

@app.post("/predict/", response_model=PredictionResult)
async def predict(input_data: PredictionInput):
    # Perform NER inference
    ner_predictions = perform_prediction(input_data.text)
    # Prepare the response
    response = PredictionResult(predictions=ner_predictions)
    return response

def perform_prediction(input_text):
    # Perform NER inference using the provided text
    ner_results = pipe1(input_text)
    # Create NERPrediction instances and populate the list
    ner_predictions = []
    for result in ner_results:
        entity_group = result.get('entity_group') or result.get('entity')
        if entity_group:
            ner_prediction = NERPrediction(
                entity_group=entity_group,
                score=result.get('score', 0.0),
                word=result.get('word', ''),
                start=result.get('start', 0),
                end=result.get('end', 0)
            )
            ner_predictions.append(ner_prediction)
    return ner_predictions
a.py (code run to get response from API)
import requests
import urllib.parse
text = "señor se presenta con fiebre y sudores fríos"
# Define the payload as a dictionary
url_encoded_text = urllib.parse.quote(text)
payload = {"text": url_encoded_text}
# Define the URL
url = "http://localhost:8000/predict/"
# Send the POST request with proper headers
headers = {"Content-Type": "application/json"}
response = requests.post(url, json=payload, headers=headers)
# Print the response
print(response.status_code)
print(response.json())
b.py (code run to check if the output is the same)
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
text = "señor se presenta con fiebre y sudores fríos"
tokenizer1 = AutoTokenizer.from_pretrained("lcampillos/roberta-es-clinical-trials-ner")
model1 = AutoModelForTokenClassification.from_pretrained("lcampillos/roberta-es-clinical-trials-ner")
pipe1 = pipeline(task="token-classification", model=model1.to("cpu"), binary_output=True, tokenizer=tokenizer1,
                 aggregation_strategy="average")

def perform_inference(txt):
    ner_output = pipe1(txt)
    return ner_output

print(perform_inference(text))
Output for a.py:
{'predictions': [{'entity_group': 'B-DISO', 'score': 0.9982337951660156, 'word': 'fiebre', 'start': 35, 'end': 41}, {'entity_group': 'B-DISO', 'score': 0.9964337348937988, 'word': 's', 'start': 48, 'end': 49}, {'entity_group': 'B-DISO', 'score': 0.9981486797332764, 'word': 'ud', 'start': 49, 'end': 51}, {'entity_group': 'B-DISO', 'score': 0.9970242381095886, 'word': 'ores', 'start': 51, 'end': 55}, {'entity_group': 'B-DISO', 'score': 0.9723742008209229, 'word': 'os', 'start': 66, 'end': 68}]}
Output for b.py:
[{'entity_group': 'DISO', 'score': 0.99900204, 'word': ' fiebre', 'start': 22, 'end': 28}, {'entity_group': 'DISO', 'score': 0.99870765, 'word': ' sudores fríos', 'start': 31, 'end': 44}]
I have tried removing the URL-encoding and sending the text directly as JSON to the API. This does change the output, but it still does not match the original output.
Many thanks in advance for your support.
2 Answers
Finally sorted out by forwarding the raw text, as suggested by @Isabi, and rewriting the functions so that they work well with the token-classification task.
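To illustrate why forwarding the raw text matters, here is a minimal sketch (not the code from this answer). It only uses the standard library, and the variable names are illustrative. It shows that urllib.parse.quote turns the sentence into a different string, so the model sees percent-escapes instead of accented characters and the character offsets shift, while a JSON body carries the original text unchanged.

```python
import json
from urllib.parse import quote

text = "señor se presenta con fiebre y sudores fríos"

# What a.py put into the payload: a percent-encoded copy of the text.
quoted = quote(text)

# Character offsets of the same word in both strings.
raw_start = text.find("fiebre")
quoted_start = quoted.find("fiebre")

# A JSON body needs no extra encoding: serializing and parsing
# gives back exactly the original string, accents included.
body = json.dumps({"text": text})
received = json.loads(body)["text"]

print(quoted)
print(raw_start, quoted_start)
print(received == text)
```

The two offsets come out as 22 and 35, which are the start values reported for "fiebre" in the b.py and a.py outputs respectively. On the client side this means passing payload = {"text": text} to requests.post(url, json=payload) without calling quote first.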
I haven’t used NLP LLMs, but the pipe is different between main.py and b.py: the task string is different ("ner" vs. "token-classification"), and the former lacks both aggregation_strategy and binary_output, which could lead to different results.
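As a rough sketch of what aligning the two pipelines could look like, the snippet below builds the pipeline with the same arguments as b.py and converts its aggregated output into plain Python values that a response model like NERPrediction can accept. The helper names build_pipeline and to_predictions are hypothetical, and the float/int conversion assumes the pipeline may return NumPy scalar types for the scores. The sample data is copied from the b.py output in the question, so the model does not have to be loaded to try the conversion.

```python
from typing import Any, Dict, List

MODEL_NAME = "lcampillos/roberta-es-clinical-trials-ner"


def build_pipeline():
    """Build the pipeline with the same settings as b.py (hypothetical helper)."""
    # Imported here so to_predictions can be used without loading transformers.
    from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME)
    return pipeline(
        task="token-classification",
        model=model.to("cpu"),
        tokenizer=tokenizer,
        binary_output=True,
        aggregation_strategy="average",  # merges sub-word tokens into entity groups
    )


def to_predictions(results: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert aggregated pipeline results into plain, JSON-friendly dicts."""
    predictions = []
    for result in results:
        predictions.append({
            "entity_group": result["entity_group"],
            "score": float(result["score"]),  # assumed to be a NumPy scalar in practice
            "word": result["word"],
            "start": int(result["start"]),
            "end": int(result["end"]),
        })
    return predictions


# Sample taken from the b.py output shown in the question.
sample = [
    {"entity_group": "DISO", "score": 0.99900204, "word": " fiebre", "start": 22, "end": 28},
    {"entity_group": "DISO", "score": 0.99870765, "word": " sudores fríos", "start": 31, "end": 44},
]
converted = to_predictions(sample)
print(converted)
```

In main.py, perform_prediction could then return to_predictions(pipe(input_text)), with pipe created once at startup by build_pipeline(). With aggregation_strategy set, each result carries an entity_group key rather than entity, which is why the B- prefixes and split sub-words from the a.py output disappear.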