Retrieving data and results¶
To retrieve resources, we first need a client instance:
from caplena import Client, resources
client = Client(api_key="YOUR_API_KEY")
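Hard-coding the key is fine for a quick test, but it is usually safer to read it from the environment. The following is a minimal sketch, assuming the key is stored in an environment variable named CAPLENA_API_KEY. Both that name and the load_api_key helper are chosen here for illustration and are not part of the library:

```python
import os


def load_api_key(var_name: str = "CAPLENA_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The variable name is an assumption made for this example.
    """
    api_key = os.environ.get(var_name, "").strip()
    if not api_key:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return api_key


# Usage, assuming the variable is set in your shell:
# client = Client(api_key=load_api_key())
```

Failing early with a clear error tends to be easier to debug than an authentication failure on the first request.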
Retrieving projects¶
We can retrieve single projects using their ID:
project = client.projects.retrieve(id="pj_1234k")
Or find multiple projects using filters:
from caplena.filters import ProjectsFilter as P
projects = client.projects.list(filter=P.tags("NPS"))
for project in projects:
    print(project.id, "-", project.name)
Retrieving upload status¶
Get status of all bulk upload tasks from the last 7 days:
statuses = client.projects.get_append_status(project_id=project.id)
Get the status of a single upload task:
status = client.projects.get_append_status(project_id=project.id, task_id="18e5b9c4-3498-45e9-a1ac-77659fdd13e1")
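If you need to wait for an upload task to finish, you can poll the status call in a loop. The helper below is a generic sketch, not part of the caplena library: wait_until, fetch and is_done are hypothetical names, and because the fields of the returned status object are not described here, the "done" check is left to a function you supply.

```python
import time
from typing import Any, Callable


def wait_until(fetch: Callable[[], Any], is_done: Callable[[Any], bool],
               interval: float = 5.0, timeout: float = 300.0) -> Any:
    """Call fetch() repeatedly until is_done(result) is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("upload task did not finish in time")
        time.sleep(interval)


# Usage sketch (the is_done check is an assumption; inspect the status
# object returned by your own call to decide what "finished" means):
# status = wait_until(
#     fetch=lambda: client.projects.get_append_status(project_id=project.id, task_id=task_id),
#     is_done=lambda s: ...,
# )
```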
Filtering examples¶
Find projects matching multiple tags:
projects = client.projects.list(filter=P.tags("NPS") & P.tags("test"))
Find projects matching at least one of two tags:
projects = client.projects.list(filter=P.tags("NPS") | P.tags("test"))
Find projects created at or after a given date:
client.projects.list(filter=P.created(gte="2022-01-01T00:00:00"))
Find projects last modified at or after a given date that also match a tag:
client.projects.list(filter=P.last_modified(gte="2022-01-01T00:00:00") & P.tags("NPS"))
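The date filters above take timestamps in the form "2022-01-01T00:00:00". For a relative cutoff such as "the last 30 days", the string can be built with the standard library. This is a small sketch: iso_days_ago is a hypothetical helper, and it assumes naive timestamps without a timezone offset, as in the examples above.

```python
from datetime import datetime, timedelta
from typing import Optional


def iso_days_ago(days: int, now: Optional[datetime] = None) -> str:
    """Return an ISO 8601 timestamp (second precision) for `days` days before `now`."""
    now = now or datetime.now()
    cutoff = (now - timedelta(days=days)).replace(microsecond=0)
    return cutoff.isoformat()


# Usage sketch:
# client.projects.list(filter=P.last_modified(gte=iso_days_ago(30)))
```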
Retrieving topics¶
Topics discovered and refined in the UI can be retrieved directly from the ProjectDetail object:
project = client.projects.retrieve(id="pj_1234k")
for col in project.columns:
    if col.type == 'text_to_analyze':
        print(col.ref, ": ", col.topics)
If the topics array is empty, this means that no analysis has been performed yet.
Retrieving rows¶
Rows can be retrieved from the project instance by row ID:
row = project.retrieve_row(id="ro_1234k")
or using the client's ProjectsController:
row = client.projects.retrieve_row(p_id="pj_werk2", r_id="ro_1ek4d")
Listing rows works similarly to listing projects:
rows = project.list_rows()
for row in rows:
    print(row.id)
Filtering rows¶
Filters allow you to fetch rows matching specific criteria:
from caplena.filters import RowsFilter as R
rows = project.list_rows(filter=R.created(gte="2022-01-01T00:00:00"))
Rows can also be filtered by column values. Again, we use the ref to reference columns:
rows = project.list_rows(filter=R.Columns.numerical(ref='id', exact=1))
rows = project.list_rows(filter=R.Columns.text_to_analyze(ref='nps_why', source_language="de"))
Retrieving row values¶
Rows are fetched in batches. If we want to have all row values in an object in memory, we need to iterate through all rows:
records = []
rows = project.list_rows()
for row in rows:
    ref_to_val = {col.ref: col.value for col in row.columns}  # mapping from column ref to value
    records.append(ref_to_val)
You can use the records to, for example, populate a database or create a pandas DataFrame:
import pandas as pd
df = pd.DataFrame(records)
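The same list of ref-to-value mappings can also be written to a database. The sketch below uses the standard library's sqlite3 module with an in-memory database. The table name, column refs and values are made up for illustration; in practice they would come from your own project.

```python
import sqlite3

# Stand-in records shaped like ref -> value mappings;
# the refs and values here are invented for the example.
records = [
    {"id": 1, "nps_why": "Great support"},
    {"id": 2, "nps_why": "Too expensive"},
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE responses (id INTEGER PRIMARY KEY, nps_why TEXT)")
# Named placeholders (:id, :nps_why) are filled from each dict's keys.
conn.executemany(
    "INSERT INTO responses (id, nps_why) VALUES (:id, :nps_why)", records
)
conn.commit()

stored = conn.execute("SELECT id, nps_why FROM responses ORDER BY id").fetchall()
```

Named placeholders keep the insert statement independent of the key order in each record.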
Retrieving analysis results¶
Caplena adds results to columns of type text_to_analyze during the analysis. The main results are the topics, which list the topics matched to the column's value. Each topic has a topic.label and a topic.category attribute. Topics with topic.sentiment_enabled=True also have the relevant sentiment in topic.sentiment.
rows = project.list_rows()
for row in rows:
    for col in row.columns:
        if col.type == "text_to_analyze":
            print(f"Results for value: {col.value}")
            print(f" Overall sentiment: {col.sentiment_overall}")
            print(" Topics: ")
            for topic in col.topics:
                print(f" Category {topic.category}, Label {topic.label} and Sentiment {topic.sentiment}")
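For further processing it is often handier to flatten the results into one record per assigned topic instead of printing them. The sketch below does this with a hypothetical helper, flatten_topics, using only the attributes shown above (row.id, col.type, col.ref, col.topics, topic.category, topic.label, topic.sentiment). The demo rows are stand-in objects with invented values so that the example runs without an API call.

```python
from types import SimpleNamespace as NS


def flatten_topics(rows):
    """Return one dict per (row, column, topic) combination."""
    flat = []
    for row in rows:
        for col in row.columns:
            if col.type != "text_to_analyze":
                continue  # only analyzed text columns carry topics
            for topic in col.topics:
                flat.append({
                    "row_id": row.id,
                    "ref": col.ref,
                    "category": topic.category,
                    "label": topic.label,
                    "sentiment": topic.sentiment,
                })
    return flat


# Stand-in rows mimicking the attribute names used above; all values are made up.
demo_rows = [
    NS(id="ro_1", columns=[
        NS(type="numerical", ref="id", value=1),
        NS(type="text_to_analyze", ref="nps_why", value="Great support", topics=[
            NS(category="SERVICE", label="Support", sentiment="positive"),
        ]),
    ]),
]
flat = flatten_topics(demo_rows)
```

With real data, passing project.list_rows() instead of demo_rows should work the same way, provided the row objects expose the attributes listed above.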