=================================
Creating Projects and Adding Rows
=================================

Creating the project
~~~~~~~~~~~~~~~~~~~~

To create a project, we first need a client instance:

.. code-block:: python

    from caplena import Client

    client = Client(api_key="YOUR_API_KEY")

Next, we build the project's columns, which define the schema of the rows to be added.

.. code-block:: python

    from caplena.models.projects import (
        NonTTAColumnDefinition,
        NonTTAColumnType,
        TTAColumnDefinition,
        TTAColumnType,
    )

    columns = [
        NonTTAColumnDefinition(
            ref="id",  # ref is a unique identifier for the column in the project
            name="Survey Response ID",  # name is what is shown in the user interface
            type=NonTTAColumnType.numerical,
        ),
        TTAColumnDefinition(
            ref="nps_why",
            name="Why did you give this rating?",
            type=TTAColumnType.text_to_analyze,
            description="Please explain the rating in a few sentences.",
            topics=[],
        ),
    ]

Now we're ready to create the project:

.. code-block:: python

    from caplena.models.projects import ProjectLanguage, ProjectSettings

    project_settings = ProjectSettings(
        name="NPS Study",
        language=ProjectLanguage.EN,
        columns=columns,
        tags=["NPS"],
    ).model_dump(exclude_none=True)

    new_project = client.projects.create(**project_settings)

Optionally, we can pass :code:`translation_engine="google_translate"` to translate rows automatically using Google Translate.

The newly created :code:`new_project` has a generated unique identifier, :code:`new_project.id`. Its schema can be inspected using :code:`new_project.columns`.

Appending rows
~~~~~~~~~~~~~~

We can now proceed to add rows to the project. A maximum of 20 rows can be added per request, so we need to batch our data.

In this example, we'll generate some fake rows; in your application you might instead read them from a database, another API, or a CSV file. The ordering of columns within a row does not matter, as columns are referenced by their *ref*.

.. code-block:: python

    import numpy as np

    from caplena.models.projects import (
        MultipleRowPayload,
        NonTTACell,
        RowPayload,
        TTACell,
    )

    # generate fake rows
    rows = MultipleRowPayload(
        rows=[
            RowPayload(
                columns=[
                    NonTTACell(ref="id", value=i),
                    TTACell(ref="nps_why", value=f"Row {i}", topics=[]),
                ]
            )
            for i in range(100)
        ]
    ).model_dump()["rows"]

    # batch the rows, using numpy for convenience
    n_batches = int(np.ceil(len(rows) / 20))  # number of batches needed
    row_batches = np.array_split(rows, n_batches)  # do the batching

    new_rows = []
    for row_batch in row_batches:
        # append_rows expects a list, so cast back from ndarray
        new_rows.append(new_project.append_rows(rows=list(row_batch)))

This process takes a while. To monitor its progress, you can use the :code:`task_id` property of each :code:`RowsAppend` response and call :code:`get_append_status`.

.. code-block:: python

    import time

    # check append statuses one by one using their task IDs
    for append_task in new_rows:
        while new_project.get_append_status(task_id=append_task.task_id).status == 'in_progress':
            time.sleep(10)

    # OR: check all append statuses of the project at once
    all_tasks = new_project.get_append_status()
    for task in all_tasks.tasks:
        if task['status'] == 'in_progress':
            pass  # do something when the upload is not ready yet
        elif task['status'] == 'failed':
            pass  # do something when the task has failed
        elif task['status'] == 'timed_out':
            pass  # do something when the task has timed out
        elif task['status'] == 'succeeded':
            pass  # do something when the task has succeeded

Once all upload tasks have succeeded, the data is in Caplena and ready to be analyzed!
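
Rather than handling each status inline as above, you may simply want to block until every append task has finished before starting the analysis. Below is a minimal polling sketch, assuming :code:`get_append_status()` behaves exactly as shown above; the :code:`wait_for_append` helper and its default timeout are our own hypothetical names, not part of the SDK.

.. code-block:: python

    import time

    def wait_for_append(project, poll_interval=10, timeout=600):
        """Hypothetical helper: block until no append task is 'in_progress'."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            # get_append_status() without arguments returns all tasks, as shown above
            statuses = [task["status"] for task in project.get_append_status().tasks]
            if "in_progress" not in statuses:
                return statuses
            time.sleep(poll_interval)
        raise TimeoutError("append tasks did not finish within the timeout")

    statuses = wait_for_append(new_project)
    failed = [s for s in statuses if s in ("failed", "timed_out")]
    if failed:
        print(f"{len(failed)} append task(s) did not complete successfully")

Polling against a hard deadline like this avoids hanging forever if a task gets stuck in :code:`in_progress`.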
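
As a side note, the numpy-based batching shown earlier can also be done with plain list slicing if you'd rather avoid the extra dependency. A minimal sketch, assuming the :code:`rows` list built above; :code:`BATCH_SIZE` is just our own name for the 20-row API limit:

.. code-block:: python

    BATCH_SIZE = 20  # maximum number of rows per append_rows request

    # slice the rows into consecutive batches of at most BATCH_SIZE
    row_batches = [rows[i:i + BATCH_SIZE] for i in range(0, len(rows), BATCH_SIZE)]

    new_rows = [new_project.append_rows(rows=batch) for batch in row_batches]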