Documented example of extending TransformComponent is incorrect #1353

Open
@brettimus

Description

Issue in apps/docs/docs/modules/ingestion_pipeline/transformations.md

There is an issue with an example of ingestion/transformation on the documentation site.

In the section on custom transformations (website here | doc page in repo here), the docs show that you need to implement a transform method on a subclass of TransformComponent:

import { TransformComponent, TextNode } from "llamaindex";

export class RemoveSpecialCharacters extends TransformComponent {
  async transform(nodes: TextNode[]): Promise<TextNode[]> {
    for (const node of nodes) {
      node.text = node.text.replace(/[^\w\s]/gi, "");
    }

    return nodes;
  }
}

Then use it like this:

async function main() {
  const pipeline = new IngestionPipeline({
    transformations: [new RemoveSpecialCharacters()],
  });

  const nodes = await pipeline.run({
    documents: [
      new Document({ text: "I am 10 years old. John is 20 years old." }),
    ],
  });

  for (const node of nodes) {
    console.log(node.getContent(MetadataMode.NONE));
  }
}

However, the implementation of TransformComponent expects a transform function to be passed to its constructor (code here, in packages/core/src/schema/types.ts):

export class TransformComponent {
  constructor(transformFn: TransformComponentSignature) {
    Object.defineProperties(
      transformFn,
      Object.getOwnPropertyDescriptors(this.constructor.prototype),
    );
    const transform = function transform(
      ...args: Parameters<TransformComponentSignature>
    ) {
      return transformFn(...args);
    };
    Reflect.setPrototypeOf(transform, new.target.prototype);
    transform.id = randomUUID();
    return transform;
  }
}

This means we get a type error when using the code from the docs:

[Two screenshots of the type error]
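A possible shape for a corrected docs example, sketched on the assumption that the constructor quoted above is the intended behaviour: the subclass passes its logic to super(...) instead of overriding transform. So that the snippet runs without llamaindex installed, TransformComponentSketch, TextNodeLike, TransformFn, applyTransform and demo are hypothetical stand-ins, not library names, and the stand-in constructor leaves out the id assignment and property copying.

```typescript
// Hypothetical stand-in for llamaindex's TextNode; only `text` is needed here.
interface TextNodeLike {
  text: string;
}

// Assumed shape of TransformComponentSignature, reduced to a single argument.
type TransformFn = (nodes: TextNodeLike[]) => Promise<TextNodeLike[]>;

// Simplified copy of the constructor quoted above (id assignment and
// property copying omitted). This is NOT the real llamaindex class.
class TransformComponentSketch {
  constructor(transformFn: TransformFn) {
    const transform = function transform(...args: Parameters<TransformFn>) {
      return transformFn(...args);
    };
    // Keep `instanceof` working for the subclass that called `new`.
    Reflect.setPrototypeOf(transform, new.target.prototype);
    return transform as unknown as TransformComponentSketch;
  }
}

// Instead of overriding `transform`, hand the logic to `super(...)`.
class RemoveSpecialCharacters extends TransformComponentSketch {
  constructor() {
    super(async (nodes) => {
      for (const node of nodes) {
        node.text = node.text.replace(/[^\w\s]/gi, "");
      }
      return nodes;
    });
  }
}

// Hypothetical helper: the constructed instance is itself the callable
// transform, so it is invoked directly after a cast.
function applyTransform(
  component: TransformComponentSketch,
  nodes: TextNodeLike[],
): Promise<TextNodeLike[]> {
  return (component as unknown as TransformFn)(nodes);
}

// Runs the transform over one node and returns the cleaned text.
async function demo(): Promise<string> {
  const nodes = await applyTransform(new RemoveSpecialCharacters(), [
    { text: "I am 10 years old. John is 20 years old." },
  ]);
  return nodes.map((node) => node.text).join("\n");
}
```

With the real library, the subclass would presumably extend TransformComponent and be passed to IngestionPipeline as in the docs snippet; whether the docs or the class should change is a separate question for the maintainers.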

Labels: bug, documentation