@guaran-ia

GuaranIA

GuaranIA: Integrating the Guarani Language in the Digital Sphere for the Inclusion of Rural and Vulnerable Populations

Popular repositories

  1. corpus Public

    The central repository with the main codebase for data ingestion, preprocessing, and model training pipelines.

    Jupyter Notebook

  2. guarascraper Public

    Web scraper for Guarani text available online

    Python

  3. fineweb2-exploration Public

    Code used to explore the fineweb2 dataset

    Jupyter Notebook

  4. hltk-exploration Public

    Code used to explore the HLTK dataset

    Python

  5. guardrails Public

    Code that implements guardrail features to identify and remove inappropriate content

    Python

  6. existing-guarani-corpora Public

    Code used to explore the existing Guarani corpora

    Python

