Models

Pydantic models for OpenAIRE API entities and responses.

`DataSourceResponse = ApiResponse[DataSource]` `module-attribute`

Type alias for an API response containing a list of DataSource entities.

`OrganizationResponse = ApiResponse[Organization]` `module-attribute`

Type alias for an API response containing a list of Organization entities.

`ProjectResponse = ApiResponse[Project]` `module-attribute`

Type alias for an API response containing a list of Project entities.

`ResearchProductResponse = ApiResponse[ResearchProduct]` `module-attribute`

Type alias for an API response containing a list of ResearchProduct entities.

`ApiResponse`

Bases: BaseModel

Generic Pydantic model for standard OpenAIRE API list responses.

This model represents the common envelope structure for API responses that return a list of entities. It includes a `header` (metadata) and a `results` field containing the list of entities. It is generic over `EntityType` to allow specific entity types to be used in the `results` list.

Attributes:

Name	Type	Description
`header`	`Header`	A `Header` object containing metadata about the response.
`results`	`list[EntityType] \| None`	An optional list of entities of type `EntityType`. A validator ensures this field is a list or None, handling potential API inconsistencies gracefully.

Source code in src/aireloom/models/base.py

class ApiResponse[EntityType: "BaseEntity"](BaseModel):
    """Generic Pydantic model for standard OpenAIRE API list responses.

    This model represents the common envelope structure for API responses that
    return a list of entities. It includes a `header` (metadata) and a `results`
    field containing the list of entities. It is generic over `EntityType` to
    allow specific entity types to be used in the `results` list.

    Attributes:
        header: A `Header` object containing metadata about the response.
        results: An optional list of entities of type `EntityType`. A validator
                 ensures this field is a list or None, handling potential API
                 inconsistencies gracefully.
    """

    header: Header
    # Results can sometimes be null/absent, sometimes an empty list
    results: list[EntityType] | None = None

    @field_validator("results", mode="before")
    @classmethod
    def handle_null_results(cls, v: Any) -> list[EntityType] | None:
        """Ensure 'results' is a list or None.

        Handles potential None or unexpected formats from the API.
        Logs a warning and returns an empty list for unexpected types.
        """
        if v is None:
            return None  # Explicitly return None if API sends null
        if isinstance(v, list):
            return v  # Already a list

        # Handle unexpected formats (e.g., dict wrappers like {'result': [...]})
        # or other non-list types by logging and returning an empty list.
        logger.warning(
            f"Unexpected format for 'results' field: {type(v)}. "
            f"Expected list or None, got {v!r}. Returning empty list."
        )
        return []

    model_config = ConfigDict(extra="allow")

`handle_null_results(v)` `classmethod`

Ensure 'results' is a list or None.

Handles potential None or unexpected formats from the API. Logs a warning and returns an empty list for unexpected types.

Source code in src/aireloom/models/base.py

@field_validator("results", mode="before")
@classmethod
def handle_null_results(cls, v: Any) -> list[EntityType] | None:
    """Ensure 'results' is a list or None.

    Handles potential None or unexpected formats from the API.
    Logs a warning and returns an empty list for unexpected types.
    """
    if v is None:
        return None  # Explicitly return None if API sends null
    if isinstance(v, list):
        return v  # Already a list

    # Handle unexpected formats (e.g., dict wrappers like {'result': [...]})
    # or other non-list types by logging and returning an empty list.
    logger.warning(
        f"Unexpected format for 'results' field: {type(v)}. "
        f"Expected list or None, got {v!r}. Returning empty list."
    )
    return []
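
The three branches of the validator can be tried in isolation. The following is a minimal sketch, not the library's code: `normalize_results` is a hypothetical stand-in that mirrors the validator body as a plain function, so it runs without Pydantic.

```python
from __future__ import annotations

import logging
from typing import Any

logger = logging.getLogger(__name__)


def normalize_results(v: Any) -> list | None:
    """Hypothetical stand-in for the validator body shown above."""
    if v is None:
        return None  # an explicit null stays None
    if isinstance(v, list):
        return v  # lists pass through untouched
    # Anything else (for example a dict wrapper) is logged and replaced.
    logger.warning("Unexpected format for 'results' field: %s", type(v))
    return []


print(normalize_results(None))                        # None
print(normalize_results([{"id": "x"}]))               # [{'id': 'x'}]
print(normalize_results({"result": [{"id": "x"}]}))   # []
```

Because `results` can end up as either `None` or an empty list, calling code that only wants to iterate can treat both the same way, for example with `response.results or []`.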

`BaseEntity`

Bases: BaseModel

A base Pydantic model for OpenAIRE entities (e.g., publication, project).

This model provides a common foundation for all specific entity types, primarily by ensuring an `id` field is present, which is a common identifier across most OpenAIRE entities. It allows extra fields from the API to be captured without causing validation errors.

Attributes:

Name	Type	Description
`id`	`str`	The unique identifier for the entity.

Source code in src/aireloom/models/base.py

class BaseEntity(BaseModel):
    """A base Pydantic model for OpenAIRE entities (e.g., publication, project).

    This model provides a common foundation for all specific entity types,
    primarily by ensuring an `id` field is present, which is a common
    identifier across most OpenAIRE entities. It allows extra fields from the
    API to be captured without causing validation errors.

    Attributes:
        id: The unique identifier for the entity.
    """

    # Common identifier across most entities
    id: str

    model_config = ConfigDict(extra="allow")

`ControlledField`

Bases: BaseModel

Represents a field with a controlled vocabulary, typically including a scheme and a value.

This model is used for structured data elements where the value has a specific meaning defined by an associated scheme (e.g., a PID like DOI, or a subject classification from a specific thesaurus).

Attributes:

Name	Type	Description
`scheme`	`str \| None`	The scheme or system defining the context of the value (e.g., "doi", "orcid", "mesh").
`value`	`str \| None`	The actual value from the controlled vocabulary.

Source code in src/aireloom/models/data_source.py

class ControlledField(BaseModel):
    """Represents a field with a controlled vocabulary, typically including a scheme and a value.

    This model is used for structured data elements where the value has a specific
    meaning defined by an associated scheme (e.g., a PID like DOI, or a subject
    classification from a specific thesaurus).

    Attributes:
        scheme: The scheme or system defining the context of the value (e.g., "doi", "orcid", "mesh").
        value: The actual value from the controlled vocabulary.
    """

    scheme: str | None = None
    value: str | None = None

    model_config = ConfigDict(extra="allow")

`Country`

Bases: BaseModel

Represents the country associated with an organization.

Attributes:

Name	Type	Description
`code`	`str \| None`	The ISO 3166-1 alpha-2 country code (e.g., "GR", "US").
`label`	`str \| None`	The human-readable name of the country (e.g., "Greece").

Source code in src/aireloom/models/organization.py

class Country(BaseModel):
    """Represents the country associated with an organization.

    Attributes:
        code: The ISO 3166-1 alpha-2 country code (e.g., "GR", "US").
        label: The human-readable name of the country (e.g., "Greece").
    """

    code: str | None = None
    label: str | None = None

    model_config = ConfigDict(extra="allow")

`DataSource`

Bases: BaseEntity

Model representing an OpenAIRE Data Source entity.

A data source in OpenAIRE can be a repository, journal, aggregator, etc. This model captures various metadata fields associated with a data source.

Source code in src/aireloom/models/data_source.py

class DataSource(BaseEntity):
    """Model representing an OpenAIRE Data Source entity.

    A data source in OpenAIRE can be a repository, journal, aggregator, etc.
    This model captures various metadata fields associated with a data source.
    """

    originalIds: list[str] | None = Field(default_factory=list)
    pids: list[ControlledField] | None = Field(default_factory=list)
    type: ControlledField | None = None
    openaireCompatibility: str | None = None
    officialName: str | None = None
    englishName: str | None = None
    websiteUrl: str | None = None
    logoUrl: str | None = None
    dateOfValidation: str | None = None
    description: str | None = None
    subjects: list[str] | None = Field(default_factory=list)
    languages: list[str] | None = Field(default_factory=list)
    contentTypes: list[str] | None = Field(default_factory=list)
    releaseStartDate: str | None = None
    releaseEndDate: str | None = None
    accessRights: AccessRightType | None = None
    uploadRights: AccessRightType | None = None
    databaseAccessRestriction: DatabaseRestrictionType | None = None
    dataUploadRestriction: str | None = None
    versioning: bool | None = None
    citationGuidelineUrl: str | None = None
    pidSystems: str | None = None
    certificates: str | None = None
    policies: list[str] | None = Field(default_factory=list)
    missionStatementUrl: str | None = None
    # Added based on documentation/analysis
    journal: Container | None = None

    model_config = ConfigDict(extra="allow")

`Funding`

Bases: BaseModel

Represents funding information for a project, including the source and stream.

Attributes:

Name	Type	Description
`fundingStream`	`FundingStream \| None`	A `FundingStream` object detailing the specific stream.
`jurisdiction`	`str \| None`	The jurisdiction associated with the funding (e.g., country code).
`name`	`str \| None`	The name of the funding body or organization.
`shortName`	`str \| None`	An optional short name or acronym for the funding body.

Source code in src/aireloom/models/project.py

class Funding(BaseModel):
    """Represents funding information for a project, including the source and stream.

    Attributes:
        fundingStream: A `FundingStream` object detailing the specific stream.
        jurisdiction: The jurisdiction associated with the funding (e.g., country code).
        name: The name of the funding body or organization.
        shortName: An optional short name or acronym for the funding body.
    """

    fundingStream: FundingStream | None = None
    jurisdiction: str | None = None
    name: str | None = None
    shortName: str | None = None
    model_config = ConfigDict(extra="allow")

`FundingStream`

Bases: BaseModel

Represents details about a specific funding stream for a project.

Attributes:

Name	Type	Description
`description`	`str \| None`	A description of the funding stream.
`id`	`str \| None`	The unique identifier of the funding stream.

Source code in src/aireloom/models/project.py

class FundingStream(BaseModel):
    """Represents details about a specific funding stream for a project.

    Attributes:
        description: A description of the funding stream.
        id: The unique identifier of the funding stream.
    """

    description: str | None = None
    id: str | None = None
    model_config = ConfigDict(extra="allow")

`Grant`

Bases: BaseModel

Represents details about the grant amounts associated with a project.

Attributes:

Name	Type	Description
`currency`	`str \| None`	The currency code for the amounts (e.g., "EUR", "USD").
`fundedAmount`	`float \| None`	The amount of funding awarded.
`totalCost`	`float \| None`	The total cost of the project.

Source code in src/aireloom/models/project.py

class Grant(BaseModel):
    """Represents details about the grant amounts associated with a project.

    Attributes:
        currency: The currency code for the amounts (e.g., "EUR", "USD").
        fundedAmount: The amount of funding awarded.
        totalCost: The total cost of the project.
    """

    currency: str | None = None
    fundedAmount: float | None = None
    totalCost: float | None = None
    model_config = ConfigDict(extra="allow")

`H2020Programme`

Bases: BaseModel

Represents details about an H2020 programme related to a project.

Attributes:

Name	Type	Description
`code`	`str \| None`	The code of the H2020 programme.
`description`	`str \| None`	A description of the H2020 programme.

Source code in src/aireloom/models/project.py

class H2020Programme(BaseModel):
    """Represents details about an H2020 programme related to a project.

    Attributes:
        code: The code of the H2020 programme.
        description: A description of the H2020 programme.
    """

    code: str | None = None
    description: str | None = None
    model_config = ConfigDict(extra="allow")

`Header`

Bases: BaseModel

Represents the 'header' section commonly found in OpenAIRE API responses.

This model captures metadata about the API response, such as status, query time, total number of results found (`numFound`), pagination details like `nextCursor`, and page size. It includes validators to coerce numeric fields that might be returned as strings by the API.

Attributes:

Name	Type	Description
`status`	`str \| None`	Optional status message from the API.
`code`	`str \| None`	Optional status code from the API.
`message`	`str \| None`	Optional descriptive message from the API.
`queryTime`	`int \| None`	Time taken by the API to process the query, in milliseconds.
`numFound`	`int \| None`	Total number of results found matching the query criteria.
`nextCursor`	`str \| HttpUrl \| None`	The cursor string to use for fetching the next page of results. Can be a string or an HttpUrl.
`pageSize`	`int \| None`	The number of results included in the current page.
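
The `nextCursor` field lends itself to a simple paging loop. The sketch below is an illustration under stated assumptions: `iter_results` and `fetch_page` are hypothetical names, the pages are plain dictionaries held in memory, and how a cursor is actually sent back to the API is not covered by this page.

```python
from __future__ import annotations

from typing import Any, Callable, Iterator


def iter_results(fetch_page: Callable[[str | None], dict[str, Any]]) -> Iterator[Any]:
    """Yield entities page by page until the header carries no nextCursor."""
    cursor: str | None = None
    while True:
        page = fetch_page(cursor)
        # results may be None or missing, so fall back to an empty list
        yield from page.get("results") or []
        cursor = (page.get("header") or {}).get("nextCursor")
        if not cursor:
            break


# Hypothetical in-memory pages standing in for real API responses.
PAGES = {
    None: {
        "header": {"numFound": "3", "nextCursor": "c1"},
        "results": [{"id": "a"}, {"id": "b"}],
    },
    "c1": {
        "header": {"numFound": "3", "nextCursor": None},
        "results": [{"id": "c"}],
    },
}

ids = [entity["id"] for entity in iter_results(PAGES.__getitem__)]
print(ids)  # ['a', 'b', 'c']
```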

Source code in src/aireloom/models/base.py

class Header(BaseModel):
    """Represents the 'header' section commonly found in OpenAIRE API responses.

    This model captures metadata about the API response, such as status,
    query time, total number of results found (`numFound`), pagination details
    like `nextCursor`, and page size. It includes validators to coerce
    numeric fields that might be returned as strings by the API.

    Attributes:
        status: Optional status message from the API.
        code: Optional status code from the API.
        message: Optional descriptive message from the API.
        queryTime: Time taken by the API to process the query, in milliseconds.
        numFound: Total number of results found matching the query criteria.
        nextCursor: The cursor string to use for fetching the next page of results.
                    Can be a string or an HttpUrl.
        pageSize: The number of results included in the current page.
    """

    # Note: status, code, message are typically expected, but optional for robustness.
    status: str | None = None
    code: str | None = None
    message: str | None = None
    # Numeric fields are often strings in the API response and are coerced by the validator below.
    queryTime: int | None = None
    numFound: int | None = None
    # nextCursor can be a full URL or just the cursor string.
    nextCursor: str | HttpUrl | None = Field(default=None)  # API returns "nextCursor"
    pageSize: int | None = None

    @field_validator("queryTime", "numFound", "pageSize", mode="before")
    @classmethod
    def coerce_str_to_int(cls, v: Any) -> int | None:
        """Coerce string representations of numbers to integers, logging on failure."""
        if isinstance(v, str):
            try:
                return int(v)
            except (ValueError, TypeError):
                logger.warning(f"Could not coerce header value '{v}' to int.")
                return None
        # Pass integers through unchanged (the API may already send numbers).
        if isinstance(v, int):
            return v
        # Any other type, including an explicit null, is logged and mapped to None.
        logger.warning(f"Unexpected type {type(v)} for header numeric value '{v}'.")
        return None

    model_config = ConfigDict(extra="allow")

`coerce_str_to_int(v)` `classmethod`

Coerce string representations of numbers to integers, logging on failure.

Source code in src/aireloom/models/base.py

@field_validator("queryTime", "numFound", "pageSize", mode="before")
@classmethod
def coerce_str_to_int(cls, v: Any) -> int | None:
    """Coerce string representations of numbers to integers, logging on failure."""
    if isinstance(v, str):
        try:
            return int(v)
        except (ValueError, TypeError):
            logger.warning(f"Could not coerce header value '{v}' to int.")
            return None
    # Pass integers through unchanged (the API may already send numbers).
    if isinstance(v, int):
        return v
    # Any other type, including an explicit null, is logged and mapped to None.
    logger.warning(f"Unexpected type {type(v)} for header numeric value '{v}'.")
    return None
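
A few sample inputs make the coercion rules concrete. This is a minimal sketch: `coerce_header_int` is a hypothetical plain-function copy of the validator logic with the logging calls left out, so it can be run on its own.

```python
from __future__ import annotations

from typing import Any


def coerce_header_int(v: Any) -> int | None:
    """Hypothetical stand-in for the validator body, without logging."""
    if isinstance(v, str):
        try:
            return int(v)
        except ValueError:
            return None  # the validator logs a warning here
    if isinstance(v, int):
        return v
    return None  # the validator logs a warning here as well


print(coerce_header_int("42"))   # 42
print(coerce_header_int(" 7 "))  # 7 (int() tolerates surrounding whitespace)
print(coerce_header_int("4.0"))  # None (not an integer literal)
print(coerce_header_int(15))     # 15
print(coerce_header_int(3.9))    # None (floats are not accepted)
```

Note that an explicit `null` in the payload takes the last branch too, so it produces `None` along with a warning in the validator shown above.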

`Organization`

Bases: BaseEntity

Model representing an OpenAIRE Organization entity.

Captures details about an organization, including its names, website, country, and various persistent identifiers. Inherits the `id` field from `BaseEntity`.

Attributes:

Name	Type	Description
`legalShortName`	`str \| None`	The official short name or acronym of the organization.
`legalName`	`str \| None`	The full official legal name of the organization.
`alternativeNames`	`list[str] \| None`	A list of other known names for the organization.
`websiteUrl`	`str \| None`	The URL of the organization's official website.
`country`	`Country \| None`	A `Country` object representing the organization's country.
`pids`	`list[OrganizationPid] \| None`	A list of `OrganizationPid` objects representing various PIDs associated with the organization.

Source code in src/aireloom/models/organization.py

class Organization(BaseEntity):
    """Model representing an OpenAIRE Organization entity.

    Captures details about an organization, including its names, website,
    country, and various persistent identifiers. Inherits the `id` field
    from `BaseEntity`.

    Attributes:
        legalShortName: The official short name or acronym of the organization.
        legalName: The full official legal name of the organization.
        alternativeNames: A list of other known names for the organization.
        websiteUrl: The URL of the organization's official website.
        country: A `Country` object representing the organization's country.
        pids: A list of `OrganizationPid` objects representing various PIDs
              associated with the organization.
    """

    # id is inherited from BaseEntity
    legalShortName: str | None = None
    legalName: str | None = None
    alternativeNames: list[str] | None = Field(default_factory=list)
    websiteUrl: str | None = None
    country: Country | None = None
    pids: list[OrganizationPid] | None = Field(default_factory=list)

    model_config = ConfigDict(extra="allow")

`OrganizationPid`

Bases: BaseModel

Represents a persistent identifier (PID) for an organization.

Attributes:

Name	Type	Description
`scheme`	`str \| None`	The scheme of the PID (e.g., "ror", "grid", "isni").
`value`	`str \| None`	The value of the PID.

Source code in src/aireloom/models/organization.py

class OrganizationPid(BaseModel):
    """Represents a persistent identifier (PID) for an organization.

    Attributes:
        scheme: The scheme of the PID (e.g., "ror", "grid", "isni").
        value: The value of the PID.
    """

    scheme: str | None = None
    value: str | None = None

    model_config = ConfigDict(extra="allow")

`Project`

Bases: BaseEntity

Model representing an OpenAIRE Project entity.

Captures comprehensive information about a research project, including its identifiers, title, funding, duration, and related metadata. Inherits the `id` field from `BaseEntity`.

Attributes:

Name	Type	Description
`code`	`str \| None`	The project code or grant number.
`acronym`	`str \| None`	The acronym of the project.
`title`	`str \| None`	The official title of the project.
`callIdentifier`	`str \| None`	Identifier for the funding call.
`fundings`	`list[Funding] \| None`	A list of `Funding` objects detailing the project's funding sources.
`granted`	`Grant \| None`	A `Grant` object with information about the awarded grant amounts.
`h2020Programmes`	`list[H2020Programme] \| None`	A list of `H2020Programme` objects if the project is part of H2020.
`keywords`	`list[str] \| str \| None`	A list of keywords or a single string of keywords describing the project. A validator attempts to parse comma or semicolon-separated strings.
`openAccessMandateForDataset`	`bool \| None`	Boolean indicating if there's an open access mandate for datasets produced by the project.
`openAccessMandateForPublications`	`bool \| None`	Boolean indicating if there's an open access mandate for publications from the project.
`startDate`	`str \| None`	The start date of the project (typically "YYYY-MM-DD" string).
`endDate`	`str \| None`	The end date of the project (typically "YYYY-MM-DD" string).
`subjects`	`list[str] \| None`	A list of subject classifications for the project.
`summary`	`str \| None`	A summary or abstract of the project.
`websiteUrl`	`str \| None`	The URL of the project's official website.

Source code in src/aireloom/models/project.py

class Project(BaseEntity):
    """Model representing an OpenAIRE Project entity.

    Captures comprehensive information about a research project, including its
    identifiers, title, funding, duration, and related metadata. Inherits the
    `id` field from `BaseEntity`.

    Attributes:
        code: The project code or grant number.
        acronym: The acronym of the project.
        title: The official title of the project.
        callIdentifier: Identifier for the funding call.
        fundings: A list of `Funding` objects detailing the project's funding sources.
        granted: A `Grant` object with information about the awarded grant amounts.
        h2020Programmes: A list of `H2020Programme` objects if the project is part of H2020.
        keywords: A list of keywords or a single string of keywords describing the project.
                  A validator attempts to parse comma or semicolon-separated strings.
        openAccessMandateForDataset: Boolean indicating if there's an open access
                                     mandate for datasets produced by the project.
        openAccessMandateForPublications: Boolean indicating if there's an open access
                                          mandate for publications from the project.
        startDate: The start date of the project (typically "YYYY-MM-DD" string).
        endDate: The end date of the project (typically "YYYY-MM-DD" string).
        subjects: A list of subject classifications for the project.
        summary: A summary or abstract of the project.
        websiteUrl: The URL of the project's official website.
    """

    # id is inherited from BaseEntity
    code: str | None = None
    acronym: str | None = None
    title: str | None = None
    callIdentifier: str | None = None
    fundings: list[Funding] | None = Field(default_factory=list)
    granted: Grant | None = None
    h2020Programmes: list[H2020Programme] | None = Field(default_factory=list)
    # Keywords might be a single string or a delimited string. Attempt parsing.
    keywords: list[str] | str | None = None
    openAccessMandateForDataset: bool | None = None
    openAccessMandateForPublications: bool | None = None
    # Dates are kept as string for safety due to potential missing parts or nulls.
    # Expected format is typically YYYY-MM-DD.
    startDate: str | None = None
    endDate: str | None = None
    subjects: list[str] | None = Field(default_factory=list)
    summary: str | None = None
    websiteUrl: str | None = None

    model_config = ConfigDict(extra="allow")

    @field_validator("keywords", mode="before")
    @classmethod
    def parse_keywords_string(cls, v: Any) -> list[str] | str | None:
        """Attempts to parse a keyword string into a list of strings.

        If the input `v` is a string, this validator tries to split it by common
        delimiters (comma, then semicolon). If splitting results in more than one
        part, a list of stripped parts is returned. Otherwise, the original string
        (or None if empty) is returned. If `v` is not a string (e.g., already a
        list or None), it's returned as is.

        Args:
            v: The value to parse, expected to be a string, list, or None.

        Returns:
            A list of strings if parsing was successful and yielded multiple keywords,
            the original string if no parsing occurred or yielded a single part,
            or None if the input string was empty.
        """
        if isinstance(v, str):
            # Prioritize comma, then semicolon
            delimiters = [",", ";"]
            for delimiter in delimiters:
                parts = [part.strip() for part in v.split(delimiter) if part.strip()]
                if len(parts) > 1:
                    return parts
            # If no split produced multiple parts, return the original string (or None if it was empty)
            return v if v else None
        # If not a string (e.g., already a list or None), return as is
        return v

`parse_keywords_string(v)` `classmethod`

Attempts to parse a keyword string into a list of strings.

If the input `v` is a string, this validator tries to split it by common delimiters (comma, then semicolon). If splitting results in more than one part, a list of stripped parts is returned. Otherwise, the original string (or None if empty) is returned. If `v` is not a string (e.g., already a list or None), it is returned as is.

Parameters:

Name	Type	Description	Default
`v`	`Any`	The value to parse, expected to be a string, list, or None.	required

Returns:

Type	Description
`list[str] \| str \| None`	A list of strings if parsing was successful and yielded multiple keywords, the original string if no parsing occurred or yielded a single part, or None if the input string was empty.

Source code in src/aireloom/models/project.py

@field_validator("keywords", mode="before")
@classmethod
def parse_keywords_string(cls, v: Any) -> list[str] | str | None:
    """Attempts to parse a keyword string into a list of strings.

    If the input `v` is a string, this validator tries to split it by common
    delimiters (comma, then semicolon). If splitting results in more than one
    part, a list of stripped parts is returned. Otherwise, the original string
    (or None if empty) is returned. If `v` is not a string (e.g., already a
    list or None), it's returned as is.

    Args:
        v: The value to parse, expected to be a string, list, or None.

    Returns:
        A list of strings if parsing was successful and yielded multiple keywords,
        the original string if no parsing occurred or yielded a single part,
        or None if the input string was empty.
    """
    if isinstance(v, str):
        # Prioritize comma, then semicolon
        delimiters = [",", ";"]
        for delimiter in delimiters:
            parts = [part.strip() for part in v.split(delimiter) if part.strip()]
            if len(parts) > 1:
                return parts
        # If no split produced multiple parts, return the original string (or None if it was empty)
        return v if v else None
    # If not a string (e.g., already a list or None), return as is
    return v
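
The delimiter priority is easiest to see with a handful of inputs. The sketch below copies the validator logic into a hypothetical standalone function, `parse_keywords`, so it can be run without the model class.

```python
from __future__ import annotations

from typing import Any


def parse_keywords(v: Any) -> Any:
    """Hypothetical standalone copy of the validator logic shown above."""
    if isinstance(v, str):
        for delimiter in (",", ";"):  # comma first, then semicolon
            parts = [part.strip() for part in v.split(delimiter) if part.strip()]
            if len(parts) > 1:
                return parts
        return v if v else None
    return v


print(parse_keywords("open science, FAIR data"))  # ['open science', 'FAIR data']
print(parse_keywords("open science; FAIR data"))  # ['open science', 'FAIR data']
print(parse_keywords("genomics"))                 # genomics
print(parse_keywords(""))                         # None
print(parse_keywords("a, b; c"))                  # ['a', 'b; c']
print(parse_keywords(["x", "y"]))                 # ['x', 'y']
```

The `"a, b; c"` case shows the consequence of trying the comma first: once the comma split yields more than one part, the semicolon is never considered, so mixed delimiters are only partially split.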

`ResearchProduct`

Bases: BaseEntity

Model representing an OpenAIRE Research Product entity.

This is a central model in OpenAIRE, representing various outputs of research such as publications, datasets, software, or other types. It aggregates numerous metadata fields. Inherits `id` from `BaseEntity`.

Attributes:

Name	Type	Description
`originalIds`	`list[str] \| None`	A list of original identifiers for the research product.
`pids`	`list[Pid] \| None`	A list of `Pid` objects representing persistent identifiers.
`type`	`ResearchProductType \| None`	The `ResearchProductType` (e.g., "publication", "dataset").
`title`	`str \| None`	The main title of the research product.
`authors`	`list[Author] \| None`	A list of `Author` objects.
`bestAccessRight`	`BestAccessRight \| None`	A `BestAccessRight` object indicating the determined access status.
`country`	`ResultCountry \| None`	A `ResultCountry` object indicating the country associated with the product.
`description`	`str \| None`	A textual description or abstract of the research product.
`publicationDate`	`str \| None`	The publication date of the research product (YYYY-MM-DD string).
`publisher`	`str \| None`	The name of the publisher.
`indicators`	`Indicator \| None`	An `Indicator` object containing citation and usage metrics.
`instances`	`list[Instance] \| None`	A list of `Instance` objects representing different manifestations or versions of the research product.
`language`	`Language \| None`	A `Language` object for the primary language of the product.
`subjects`	`list[Subject] \| None`	A list of `Subject` objects.
`container`	`Container \| None`	A `Container` object if the product is part of a larger collection (e.g., a journal for an article).
`geoLocation`	`GeoLocation \| None`	A `GeoLocation` object, typically for datasets.
`keywords`	`list[str] \| None`	A list of keywords. A validator attempts to parse comma-separated strings.
`journal`	`Container \| None`	An alias or alternative field for `container`, often used for journal details. (Note: API might use 'container' or 'journal' field for similar info).
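
The two input normalizations applied by this model (keyword splitting and the `mainTitle` fallback visible in the source listing below) can be sketched as plain functions. Both function names here are hypothetical stand-ins, logging is omitted, and the title sketch covers only the case where `title` is missing or None.

```python
from __future__ import annotations

from typing import Any


def split_keywords(v: Any) -> list[str] | None:
    """Hypothetical stand-in for the keywords validator."""
    if v is None:
        return None
    if isinstance(v, str):
        return [kw.strip() for kw in v.split(",") if kw.strip()]
    return None  # the validator logs a warning for any other type


def title_from_main_title(data: Any) -> Any:
    """Hypothetical stand-in: copy mainTitle into title when title is absent."""
    if isinstance(data, dict) and "mainTitle" in data:
        if data.get("title") is None:
            data["title"] = data.pop("mainTitle")
    return data


print(split_keywords("climate, oceans, , models "))  # ['climate', 'oceans', 'models']
print(split_keywords(""))                            # []
print(split_keywords(["climate", "oceans"]))         # None
print(title_from_main_title({"id": "p1", "mainTitle": "A study"}))
# {'id': 'p1', 'title': 'A study'}
```

Unlike the `Project` validator, only commas are treated as delimiters here, and a value that is already a list is discarded rather than passed through.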

Source code in src/aireloom/models/research_product.py

class ResearchProduct(BaseEntity):
    """Model representing an OpenAIRE Research Product entity.

    This is a central model in OpenAIRE, representing various outputs of research
    such as publications, datasets, software, or other types. It aggregates
    numerous metadata fields. Inherits `id` from `BaseEntity`.

    Attributes:
        originalIds: A list of original identifiers for the research product.
        pids: A list of `Pid` objects representing persistent identifiers.
        type: The `ResearchProductType` (e.g., "publication", "dataset").
        title: The main title of the research product.
        authors: A list of `Author` objects.
        bestAccessRight: A `BestAccessRight` object indicating the determined access status.
        country: A `ResultCountry` object indicating the country associated with the product.
        description: A textual description or abstract of the research product.
        publicationDate: The publication date of the research product (YYYY-MM-DD string).
        publisher: The name of the publisher.
        indicators: An `Indicator` object containing citation and usage metrics.
        instances: A list of `Instance` objects representing different manifestations
                   or versions of the research product.
        language: A `Language` object for the primary language of the product.
        subjects: A list of `Subject` objects.
        container: A `Container` object if the product is part of a larger collection
                   (e.g., a journal for an article).
        geoLocation: A `GeoLocation` object, typically for datasets.
        keywords: A list of keywords. A validator attempts to parse comma-separated strings.
        journal: An alias or alternative field for `container`, often used for journal details.
                 (Note: API might use 'container' or 'journal' field for similar info).
    """

    # id is inherited from BaseEntity
    originalIds: list[str] | None = Field(default_factory=list)
    pids: list[Pid] | None = Field(default_factory=list)
    type: ResearchProductType | None = None
    title: str | None = None
    authors: list[Author] | None = Field(default_factory=list)
    bestAccessRight: BestAccessRight | None = None
    country: ResultCountry | None = None
    description: str | None = None
    publicationDate: str | None = None
    publisher: str | None = None
    indicators: Indicator | None = None
    instances: list[Instance] | None = Field(default_factory=list)
    language: Language | None = None
    subjects: list[Subject] | None = Field(default_factory=list)
    container: Container | None = None
    geoLocation: GeoLocation | None = None
    keywords: list[str] | None = Field(default_factory=list)
    journal: Container | None = None

    model_config = ConfigDict(extra="allow", populate_by_name=True)

    @field_validator("keywords", mode="before")
    @classmethod
    def split_keywords(cls, v: Any) -> list[str] | None:
        """Attempts to split a comma-separated string of keywords into a list.

        If the input `v` is a string, it is split on commas and each part is
        stripped of whitespace; empty parts are dropped, so an empty string
        yields an empty list. If `v` is None, None is returned. Any other type
        is logged as unexpected and replaced with None.

        Args:
            v: The value to parse, expected to be a string or None.

        Returns:
            A list of keyword strings (possibly empty), or None if the input was
            None or not a string.
        """
        if v is None:
            return None
        if isinstance(v, str):
            return [kw.strip() for kw in v.split(",") if kw.strip()]
        logger.warning(
            f"Unexpected value for ResearchProduct.keywords: {v}. Expected string or None."
        )
        return None  # Or raise ValueError if strictness is preferred

    @model_validator(mode="before")
    @classmethod
    def get_title_from_main_title(cls, data: Any) -> Any:
        """Populates the `title` field from `mainTitle` if `title` is not present.

        The OpenAIRE API sometimes uses `mainTitle` for the primary title. This
        validator ensures that the `title` field in the Pydantic model is populated
        using `mainTitle` if `title` itself is missing in the input data, effectively
        aliasing `mainTitle` to `title`.

        Args:
            data: The raw input data dictionary before validation.

        Returns:
            The (potentially modified) input data dictionary.
        """
        if isinstance(data, dict) and "mainTitle" in data:
            if "title" not in data or data["title"] is None:
                # No usable title: promote mainTitle to title.
                data["title"] = data.pop("mainTitle")
            else:
                # A title already exists: keep it and discard mainTitle.
                data.pop("mainTitle", None)
        return data

`get_title_from_main_title(data)` `classmethod`

Populates the `title` field from `mainTitle` if `title` is not present.

The OpenAIRE API sometimes uses `mainTitle` for the primary title. This validator populates the model's `title` field from `mainTitle` when `title` is missing or null in the input data, effectively aliasing `mainTitle` to `title`. If both are present, `title` is kept and `mainTitle` is discarded.

Parameters:

Name	Type	Description	Default
`data`	`Any`	The raw input data dictionary before validation.	required

Returns:

Type	Description
`Any`	The (potentially modified) input data dictionary.

Source code in src/aireloom/models/research_product.py

@model_validator(mode="before")
@classmethod
def get_title_from_main_title(cls, data: Any) -> Any:
    """Populates the `title` field from `mainTitle` if `title` is not present.

    The OpenAIRE API sometimes uses `mainTitle` for the primary title. This
    validator ensures that the `title` field in the Pydantic model is populated
    using `mainTitle` if `title` itself is missing in the input data, effectively
    aliasing `mainTitle` to `title`.

    Args:
        data: The raw input data dictionary before validation.

    Returns:
        The (potentially modified) input data dictionary.
    """
    if isinstance(data, dict) and "mainTitle" in data:
        if "title" not in data or data["title"] is None:
            # No usable title: promote mainTitle to title.
            data["title"] = data.pop("mainTitle")
        else:
            # A title already exists: keep it and discard mainTitle.
            data.pop("mainTitle", None)
    return data

`split_keywords(v)` `classmethod`

Attempts to split a comma-separated string of keywords into a list.

If the input `v` is a string, it is split on commas and each part is stripped of whitespace; empty parts are dropped, so an empty string yields an empty list. If `v` is None, None is returned. Any other type is logged as unexpected and replaced with None.

Parameters:

Name	Type	Description	Default
`v`	`Any`	The value to parse, expected to be a string or None.	required

Returns:

Type	Description
`list[str] \| None`	A list of keyword strings (possibly empty), or None if the input was None or not a string.

Source code in src/aireloom/models/research_product.py

@field_validator("keywords", mode="before")
@classmethod
def split_keywords(cls, v: Any) -> list[str] | None:
    """Attempts to split a comma-separated string of keywords into a list.

    If the input `v` is a string, it is split on commas and each part is
    stripped of whitespace; empty parts are dropped, so an empty string
    yields an empty list. If `v` is None, None is returned. Any other type
    is logged as unexpected and replaced with None.

    Args:
        v: The value to parse, expected to be a string or None.

    Returns:
        A list of keyword strings (possibly empty), or None if the input was
        None or not a string.
    """
    if v is None:
        return None
    if isinstance(v, str):
        return [kw.strip() for kw in v.split(",") if kw.strip()]
    logger.warning(
        f"Unexpected value for ResearchProduct.keywords: {v}. Expected string or None."
    )
    return None  # Or raise ValueError if strictness is preferred

`ScholixCreator`

Bases: BaseModel

Represents a creator (e.g., author, contributor) in the Scholix schema.

Attributes:

Name	Type	Description
`name`	`str \| None`	The name of the creator (aliased from "Name").
`identifier`	`list[ScholixIdentifier] \| None`	An optional list of `ScholixIdentifier` objects for the creator.

Source code in src/aireloom/models/scholix.py

class ScholixCreator(BaseModel):
    """Represents a creator (e.g., author, contributor) in the Scholix schema.

    Attributes:
        name: The name of the creator (aliased from "Name").
        identifier: An optional list of `ScholixIdentifier` objects for the creator.
    """

    name: str | None = Field(alias="Name", default=None)
    identifier: list[ScholixIdentifier] | None = Field(alias="Identifier", default=None)

    model_config = ConfigDict(populate_by_name=True, extra="allow")

`ScholixEntity`

Bases: BaseModel

Represents a scholarly entity (source or target) in a Scholix relationship.

Attributes:

Name	Type	Description
`identifier`	`list[ScholixIdentifier]`	A list of `ScholixIdentifier` objects for the entity.
`type`	`ScholixEntityTypeName`	The `ScholixEntityTypeName` (e.g., "publication", "dataset").
`sub_type`	`str \| None`	An optional subtype providing more specific classification.
`title`	`str \| None`	The title of the scholarly entity.
`creator`	`list[ScholixCreator] \| None`	A list of `ScholixCreator` objects.
`publication_date`	`str \| None`	The publication date of the entity (string format).
`publisher`	`list[ScholixPublisher] \| None`	A list of `ScholixPublisher` objects.

Source code in src/aireloom/models/scholix.py

class ScholixEntity(BaseModel):
    """Represents a scholarly entity (source or target) in a Scholix relationship.

    Attributes:
        identifier: A list of `ScholixIdentifier` objects for the entity.
        type: The `ScholixEntityTypeName` (e.g., "publication", "dataset").
        sub_type: An optional subtype providing more specific classification.
        title: The title of the scholarly entity.
        creator: A list of `ScholixCreator` objects.
        publication_date: The publication date of the entity (string format).
        publisher: A list of `ScholixPublisher` objects.
    """

    identifier: list[ScholixIdentifier] = Field(alias="Identifier")
    type: ScholixEntityTypeName = Field(alias="Type")
    sub_type: str | None = Field(alias="SubType", default=None)
    title: str | None = Field(alias="Title", default=None)
    creator: list[ScholixCreator] | None = Field(alias="Creator", default=None)
    publication_date: str | None = Field(alias="PublicationDate", default=None)
    publisher: list[ScholixPublisher] | None = Field(alias="Publisher", default=None)

    model_config = ConfigDict(populate_by_name=True, extra="allow")

`ScholixIdentifier`

Bases: BaseModel

Represents a persistent identifier within the Scholix schema.

Attributes:

Name	Type	Description
`id_val`	`str`	The value of the identifier (aliased from "ID").
`id_scheme`	`str`	The scheme of the identifier (aliased from "IDScheme", e.g., "doi", "url").
`id_url`	`HttpUrl \| None`	An optional resolvable URL for the identifier (aliased from "IDURL").

Source code in src/aireloom/models/scholix.py

class ScholixIdentifier(BaseModel):
    """Represents a persistent identifier within the Scholix schema.

    Attributes:
        id_val: The value of the identifier (aliased from "ID").
        id_scheme: The scheme of the identifier (aliased from "IDScheme", e.g., "doi", "url").
        id_url: An optional resolvable URL for the identifier (aliased from "IDURL").
    """

    id_val: str = Field(alias="ID")
    id_scheme: str = Field(alias="IDScheme")
    id_url: HttpUrl | None = Field(alias="IDURL", default=None)

    model_config = ConfigDict(populate_by_name=True, extra="allow")

`ScholixLinkProvider`

Bases: BaseModel

Represents the provider of the Scholix link.

Attributes:

Name	Type	Description
`name`	`str`	The name of the link provider (aliased from "Name").
`identifier`	`list[ScholixIdentifier] \| None`	An optional list of `ScholixIdentifier` objects for the provider.

Source code in src/aireloom/models/scholix.py

class ScholixLinkProvider(BaseModel):
    """Represents the provider of the Scholix link.

    Attributes:
        name: The name of the link provider (aliased from "Name").
        identifier: An optional list of `ScholixIdentifier` objects for the provider.
    """

    name: str = Field(alias="Name")
    identifier: list[ScholixIdentifier] | None = Field(alias="Identifier", default=None)

    model_config = ConfigDict(populate_by_name=True, extra="allow")

`ScholixPublisher`

Bases: BaseModel

Represents a publisher in the Scholix schema.

Attributes:

Name	Type	Description
`name`	`str`	The name of the publisher (aliased from "Name").
`identifier`	`list[ScholixIdentifier] \| None`	An optional list of `ScholixIdentifier` objects for the publisher.

Source code in src/aireloom/models/scholix.py

class ScholixPublisher(BaseModel):
    """Represents a publisher in the Scholix schema.

    Attributes:
        name: The name of the publisher (aliased from "Name").
        identifier: An optional list of `ScholixIdentifier` objects for the publisher.
    """

    name: str = Field(alias="Name")
    identifier: list[ScholixIdentifier] | None = Field(alias="Identifier", default=None)

    model_config = ConfigDict(populate_by_name=True, extra="allow")

`ScholixRelationship`

Bases: BaseModel

Represents a single Scholix relationship link between two scholarly entities.

This is a core model in the Scholix schema, detailing the link provider, the type of relationship, the source entity, and the target entity.

Attributes:

Name	Type	Description
`link_provider`	`list[ScholixLinkProvider] \| None`	A list of `ScholixLinkProvider` objects detailing who provided the link.
`relationship_type`	`ScholixRelationshipType`	A `ScholixRelationshipType` object describing the nature of the link.
`source`	`ScholixEntity`	A `ScholixEntity` representing the source of the relationship.
`target`	`ScholixEntity`	A `ScholixEntity` representing the target of the relationship.
`link_publication_date`	`datetime \| None`	The date when this link was published or made available.
`license_url`	`HttpUrl \| None`	An optional URL pointing to the license governing the use of this link information.
`harvest_date`	`str \| None`	The date when this link information was last harvested or updated.

Source code in src/aireloom/models/scholix.py

class ScholixRelationship(BaseModel):
    """Represents a single Scholix relationship link between two scholarly entities.

    This is a core model in the Scholix schema, detailing the link provider,
    the type of relationship, the source entity, and the target entity.

    Attributes:
        link_provider: A list of `ScholixLinkProvider` objects detailing who provided the link.
        relationship_type: A `ScholixRelationshipType` object describing the nature of the link.
        source: A `ScholixEntity` representing the source of the relationship.
        target: A `ScholixEntity` representing the target of the relationship.
        link_publication_date: The date when this link was published or made available.
        license_url: An optional URL pointing to the license governing the use of this link information.
        harvest_date: The date when this link information was last harvested or updated.
    """

    link_provider: list[ScholixLinkProvider] | None = Field(
        alias="LinkProvider", default=None
    )
    relationship_type: ScholixRelationshipType = Field(alias="RelationshipType")
    source: ScholixEntity = Field(alias="Source")
    target: ScholixEntity = Field(alias="Target")
    link_publication_date: datetime | None = Field(
        alias="LinkPublicationDate",
        default=None,
        description="Date the link was published.",
    )
    license_url: HttpUrl | None = Field(alias="LicenseURL", default=None)
    harvest_date: str | None = Field(alias="HarvestDate", default=None)

    model_config = ConfigDict(populate_by_name=True, extra="allow")

`ScholixResponse`

Bases: BaseModel

Response structure for the Scholexplorer Links endpoint.

Source code in src/aireloom/models/scholix.py

class ScholixResponse(BaseModel):
    """Response structure for the Scholexplorer Links endpoint."""

    current_page: int = Field(
        alias="currentPage", description="The current page number (0-indexed)."
    )
    total_links: int = Field(
        alias="totalLinks", description="Total number of links matching the query."
    )
    total_pages: int = Field(
        alias="totalPages", description="Total number of pages available."
    )
    result: list[ScholixRelationship] = Field(
        alias="result", description="List of Scholix relationship links."
    )

    model_config = ConfigDict(populate_by_name=True, extra="allow")
