Description
Past issues indicate this is a recurring problem for others (and me!).
Any fix will have to be a bit intrusive and involve some opinionated design choices.
Hopefully this issue will serve to discuss possible solutions.
Occurrence of "multiple instances" in a schema
Multiple instances in the schema occur when two or more types share the same exported name. This can happen when the same TypeName is exported from two different files. There are three cases of this in valid TypeScript programs that I have encountered:
Case 1: Chained inheritance:
// base.ts
export interface MyObject {
a: string;
}
// intermediate.ts
import * as Base from "./base";
export interface MyObject extends Base.MyObject {
b: string;
}
// main.ts
import * as Intermediate from "./intermediate";
export interface MyObject extends Intermediate.MyObject {}
Case 2: Composition:
// componentA.ts
export interface MyObject {
a: string;
}
// componentB.ts
export interface MyObject {
b: string;
}
// main.ts
import * as A from "./componentA";
import * as B from "./componentB";
export interface MyObject {
a: A.MyObject;
b: B.MyObject;
}
Case 3: The duplicates test case. This is in principle the same as case 2, but with a different example, so it is worth testing for.
Root cause
These are all valid TypeScript programs that should yield valid JSON schemas, but our favorite schema-generator barfs. As best I can understand, this is because the generator stores each Type under its "name" in the file where it is defined and loses the context of the file path. Within the TypeScript AST these are independent nodes, each bound to a sourceFile, allowing for disambiguation when necessary. Since the Type constructors do not store the node, we lose this ability at the point of generation.
Importantly, we only need the "fully qualified name" in case of a conflict. The "simple name" should suffice in the vast majority of cases.
Possible Solution:
(References the POC implementation)
- Use getId() instead of getName() to generate all references initially - DefinitionTypeFormatter & ReferenceTypeFormatter
- Build a schema using these, but also create an idNameMap, which maps the id to its unambiguousName
- The unambiguousName is identical to getName() when there is no conflict, and otherwise uses the smallest possible prefix computed from sourceFileName deltas between all collisions. RootTypes grab the getName().
- The schema is constructed as before, removing undefined and unreachable definitions. Once done, a resolveIdRefs recursive walk uses the idNameMap to fix the schema up.
(if this sounds complicated - a proof-of-concept PR is coming right behind the issue being filed)
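To make the last step concrete, here is a minimal sketch of what a resolveIdRefs walk could look like. It is not the POC code: the JsonValue/JsonObject types, the rewriteRefs helper, and the "id-0"-style ids are assumptions made up for illustration, and the real getId() values will look different.

```typescript
// Hypothetical JSON shapes, for illustration only.
type JsonValue = string | number | boolean | null | JsonValue[] | JsonObject;
type JsonObject = { [key: string]: JsonValue };

const REF_PREFIX = "#/definitions/";

// Recursively copy a schema node, rewriting every local "$ref" from an id to its name.
// Ids that are missing from the map are left untouched.
function rewriteRefs(node: JsonValue, idNameMap: Map<string, string>): JsonValue {
    if (Array.isArray(node)) {
        return node.map((item) => rewriteRefs(item, idNameMap));
    }
    if (node === null || typeof node !== "object") {
        return node;
    }
    const result: JsonObject = {};
    for (const [key, value] of Object.entries(node)) {
        if (key === "$ref" && typeof value === "string" && value.startsWith(REF_PREFIX)) {
            const id = value.slice(REF_PREFIX.length);
            result[key] = REF_PREFIX + (idNameMap.get(id) ?? id);
        } else {
            result[key] = rewriteRefs(value, idNameMap);
        }
    }
    return result;
}

// Fix up a finished schema: rewrite all refs, then rename the top-level definition keys.
function resolveIdRefs(schema: JsonObject, idNameMap: Map<string, string>): JsonObject {
    const resolved = rewriteRefs(schema, idNameMap) as JsonObject;
    const definitions = resolved["definitions"];
    if (definitions !== null && typeof definitions === "object" && !Array.isArray(definitions)) {
        const renamed: JsonObject = {};
        for (const [id, definition] of Object.entries(definitions)) {
            renamed[idNameMap.get(id) ?? id] = definition;
        }
        resolved["definitions"] = renamed;
    }
    return resolved;
}

// Usage with made-up ids.
const exampleIdNameMap = new Map<string, string>([
    ["id-0", "MyObject"],
    ["id-1", "componentA-MyObject"],
]);
const exampleSchema: JsonObject = {
    $ref: "#/definitions/id-0",
    definitions: {
        "id-0": { type: "object", properties: { a: { $ref: "#/definitions/id-1" } } },
        "id-1": { type: "string" },
    },
};
console.log(JSON.stringify(resolveIdRefs(exampleSchema, exampleIdNameMap), null, 2));
```

Only the top-level definitions object is renamed, so a property that happens to be called "definitions" deeper in the schema is not mistaken for the definitions table.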
Opinionated parts:
- Disambiguation segment: This should be the smallest possible string that allows for proper disambiguation and makes sense to the authors/users of the TypeScript code/schema. One option would be to consider the import path that would be needed. However, this will often include a trailing index.ts, which is superfluous for our purpose. Given conflicting names, I'd like to propose removing the common prefixes and any trailing index.ts to arrive at the disambiguation string.
- Path separator: Since JSON Schema and all related tooling are built around JSON Pointer, using a "/" will cause all kinds of downstream trouble when using these schemas. I'd like to propose using "-", which is URL safe, easy on the humans, and doesn't conflict with TypeScript variable naming conventions.
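The two choices above can be sketched roughly as follows. This is an illustration under assumptions, not the POC implementation: the ExportedType shape, the isRoot flag, and the helper names disambiguationSegments and buildIdNameMap are all hypothetical, and the ids are made up.

```typescript
// Hypothetical description of one exported type, for illustration only.
interface ExportedType {
    id: string;         // stand-in for whatever getId() returns
    name: string;       // stand-in for getName()
    sourceFile: string; // path of the defining file
    isRoot: boolean;    // root types keep their plain name
}

// For files that export the same type name: strip the extension and any trailing
// "index", drop the leading path segments shared by all files, and join the rest with "-".
function disambiguationSegments(sourceFiles: string[]): string[] {
    const split = sourceFiles.map((file) => {
        const parts = file.replace(/\\/g, "/").replace(/(\.d)?\.tsx?$/, "").split("/");
        if (parts.length > 1 && parts[parts.length - 1] === "index") {
            parts.pop();
        }
        return parts;
    });
    const [first] = split;
    if (first === undefined) {
        return []; // no files, nothing to disambiguate
    }
    // Count shared leading segments, always keeping at least one segment per file.
    let common = 0;
    while (split.every((parts) => parts.length > common + 1 && parts[common] === first[common])) {
        common++;
    }
    return split.map((parts) => parts.slice(common).join("-"));
}

// Map every id to a name: the simple name when unique (or root), a prefixed name on collision.
function buildIdNameMap(types: ExportedType[]): Map<string, string> {
    const byName = new Map<string, ExportedType[]>();
    for (const type of types) {
        const group = byName.get(type.name) ?? [];
        group.push(type);
        byName.set(type.name, group);
    }
    const idNameMap = new Map<string, string>();
    byName.forEach((group, name) => {
        const segments = disambiguationSegments(group.map((type) => type.sourceFile));
        group.forEach((type, i) => {
            const needsPrefix = group.length > 1 && !type.isRoot;
            idNameMap.set(type.id, needsPrefix ? `${segments[i]}-${name}` : name);
        });
    });
    return idNameMap;
}

// Usage, mirroring the composition case with made-up ids and paths.
const exampleNames = buildIdNameMap([
    { id: "id-0", name: "MyObject", sourceFile: "src/main.ts", isRoot: true },
    { id: "id-1", name: "MyObject", sourceFile: "src/componentA.ts", isRoot: false },
    { id: "id-2", name: "MyObject", sourceFile: "src/componentB.ts", isRoot: false },
]);
console.log(exampleNames);
```

This sketch only strips common leading path segments. Whether shared trailing segments should also be collapsed is one of the opinionated choices left open here.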
Examples:
- Case 1 from above, duplicates-inheritance yields
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$ref": "#/definitions/MyObject",
"definitions": {
"MyObject": {
"type": "object",
"required": [
"a",
"b",
"c"
],
"properties": {
"a": {
"type": "string"
},
"b": {
"type": "string"
},
"c": {
"type": "string"
}
},
"additionalProperties": false
}
}
}
- Case 2 from above, duplicates-composition yields
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$ref": "#/definitions/MyObject",
"definitions": {
"MyObject": {
"type": "object",
"required": [
"a",
"b"
],
"properties": {
"a": {
"$ref": "#/definitions/componentA-MyObject"
},
"b": {
"$ref": "#/definitions/componentB-MyObject"
}
},
"additionalProperties": false
},
"componentA-MyObject": {
"type": "object",
"required": [
"a"
],
"properties": {
"a": {
"type": "string"
}
},
"additionalProperties": false
},
"componentB-MyObject": {
"type": "object",
"required": [
"b"
],
"properties": {
"b": {
"type": "string"
}
},
"additionalProperties": false
}
}
}
- Case 3, duplicates yields:
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$ref": "#/definitions/MyType",
"definitions": {
"MyType": {
"anyOf": [
{
"$ref": "#/definitions/import1-A"
},
{
"$ref": "#/definitions/import2-A"
}
]
},
"import1-A": {
"type": "number"
},
"import2-A": {
"type": "string"
}
}
}
Cons
- This likely has a slight performance hit, since we walk the schema one more time as a post-processing step. But that is no different from the walk performed by removeUnreachable.
Pros
- We'll generate schemas for a larger subset of TypeScript programs.
- Since we bind to the filename, reuse of definitions should work irrespective of how they are aliased at the point of use in the TypeScript files.