Conflict resolutions preparing for feature-refapp merge into main #289
Merged: adrianwyatt merged 34 commits into microsoft:feature-refapp from adrianwyatt:feature-refapp on Apr 3, 2023.
Conversation
### Motivation and Context
The SK package should not depend on analyzers. It looks like the xunit.analyzers package is packaged somewhat differently, and the XML needed wasn't added automatically as it was for the other analyzers.

### Description
Edit Directory.Packages.props and remove the xunit.analyzers dependency from the nuget. I also noticed a couple of duplicate files left over from a bad merge (a folder rename); this PR takes care of deleting them.

Co-authored-by: Lee Miller <[email protected]>
…Functions Retrieve() and Remove() (#35)

### Motivation and Context
Currently, TextMemorySkill `Recall` only returns the most relevant memory, despite relying on methods that support returning N relevant memories. Additionally, users should be able to `Remove` memories.

### Description
This PR introduces new terminology to TextMemorySkill: `Remove` and `Retrieve`. `Remove` and `Retrieve` allow the TextMemorySkill to get and delete memories from storage by unique key (since memories are stored as key-value pairs). Users may Retrieve/Remove a single memory using the unique key associated with that memory.

Redefined function:
- `Recall` -> look up a number of memories relevant to a given idea string. The number of results depends on a `LimitParam` and `RelevanceParam` provided by the SKContext. The resulting text records are concatenated into a single JSON string.

New functions:
- `RetrieveAsync` -> look up a specific memory associated with a unique key.
- `RemoveAsync` -> delete a specific memory associated with a unique key.

Additional changes:
- Modified KernelSyntaxExample `Example15_MemorySkill` to use the new functions and features.
- Added `RemoveAsync` to `ISemanticTextMemory` and its implementors `SemanticTextMemory` and `NullMemory`.

**Note:** If the result of `Recall` will be used by a completion skill, this could cause unexpected behavior if the text for N memories exceeds the model's token limit. Addressing input chunking for a completion model is not in scope for this PR.
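Conceptually, the key-based operations described above can be sketched as follows. This is a minimal Python sketch of the idea only, not the actual C# TextMemorySkill; the class and method names here are hypothetical:

```python
# Minimal sketch of key-based memory operations (hypothetical names, not the
# real C# TextMemorySkill API): memories are stored as key-value pairs, so
# retrieve and remove act on a single unique key.
class MemoryStoreSketch:
    def __init__(self):
        self._store = {}  # unique key -> memory text

    def save(self, key: str, text: str) -> None:
        self._store[key] = text

    def retrieve(self, key: str):
        # Look up a specific memory associated with a unique key.
        return self._store.get(key)

    def remove(self, key: str) -> None:
        # Delete a specific memory associated with a unique key.
        self._store.pop(key, None)
```

A relevance-based `Recall` would sit alongside these, ranking stored texts against a query, but key-based lookup and deletion need nothing more than the dictionary operations above.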
### Description
authenication -> authentication
Bumps [Microsoft.Data.Sqlite](https://github.com/dotnet/efcore) from 7.0.3 to 7.0.4. Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…in /dotnet (#126) Bumps [Microsoft.Identity.Client.Extensions.Msal](https://github.com/AzureAD/microsoft-authentication-extensions-for-dotnet) from 2.26.0 to 2.27.0.
… /dotnet (#127) Bumps [Microsoft.Extensions.Configuration.Binder](https://github.com/dotnet/runtime) from 7.0.3 to 7.0.4.
### Motivation and Context
This sample continues in the spirit of prior samples in the samples/apps folder, with the goal of teaching developers how the concepts of memory and embeddings can be integrated.

### Description
This sample includes a frontend app that allows the developer to enter the address of a publicly available GitHub repository, which is downloaded to the temp directory of the machine hosting the KernelHttpService sample application (this can be the same machine). Once downloaded, the developer can have a conversation with the repository and see the impact of the relevance slider on the responses.

---------

Co-authored-by: Craig Presti <[email protected]>
Co-authored-by: teresaqhoang <[email protected]>
Co-authored-by: markwallace-microsoft <[email protected]>
Co-authored-by: Tao Chen <[email protected]>
Co-authored-by: Mark Wallace <[email protected]>
Co-authored-by: Devis Lucato <[email protected]>
Co-authored-by: Devis Lucato <[email protected]>
Co-authored-by: Adrian Bonar <[email protected]>
Improve the developer experience in VSCode. Today, VSCode will auto-apply fixes for the style rules listed here: https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/style-rules/unnecessary-code-rules These result in changes the team does not want included in our style and create friction in the development flow. Additionally, launchSettings.json for `kernel-syntax-examples` will override any user-configured environment variables. To reduce the burden on folks already set up for this development flow, we can update this file.
Today, there is no sample skill that tells whether a given sentence is a question or a statement. SK can now use the new skill to classify whether a sentence is a question.
It is not obvious to users that Sample 4 currently only processes markdown files.
1. Add a paragraph in the Sample 4 readme that points out the app only processes markdown files.
2. Add a tooltip on the repo selection page that informs users that only files of the specified type will be processed.
… FunctionView (#67)

### Motivation and Context
This pull request introduces several changes to the planner skill, which is a core skill that allows users to create and execute plans based on semantic queries. The motivation and context for these changes are as follows:
- The changes improve the functionality and usability of the planner skill by adding several parameters and options to the planner skill functions, such as relevancy threshold, max functions, excluded functions, excluded skills, and included functions. These parameters allow users to customize the filtering and selection of registered functions based on their goals and preferences.
- The changes enable the planner skill to filter the available functions based on a semantic query and the relevance score from the memory provider. This allows users to specify a natural language query that matches their intent and get the most relevant functions to execute. This feature enhances the planner skill's ability to generate and execute plans that are relevant, efficient, and customizable for different goals and scenarios.
- The changes improve the user experience of the kernel syntax examples by adding more semantic skills and demonstrating how to use the planner skill to create a plan for a complex task.

These changes do not fix any open issues, but they are related to the ongoing development and improvement of the planner skill and the kernel framework. They also improve the test coverage and code quality of the planner skill, which is a key feature of the semantic kernel framework.
This commit renames the method DownloadToFileAsync to DownloadToFile in the WebSkillTests class, to match the name of the method in the WebFileDownloadSkill class. This change fixes the `WebFileDownloadSkillFileTestAsync` integration test. No functional changes are made to the method or the test.
### Motivation and Context
Add a GPT2/GPT3 tokenizer, to allow counting tokens when using OpenAI.

### Description
Add a C# port of the tokenizers recommended by OpenAI. See https://platform.openai.com/tokenizer for more info.
#178) Bumps [Microsoft.Azure.Functions.Worker](https://github.com/Azure/azure-functions-dotnet-worker) from 1.10.1 to 1.13.0.
### Bugfix
Minutes and Seconds were returning wrong information.
)

### Motivation and Context
Remove hardcoded models; allow use of ChatGPT, image generation like DallE, and any custom text generation or text embedding. Currently SK supports only "text completion" and "text embedding", and only OpenAI and Azure OpenAI. One cannot use custom LLMs. If one wants to put a proxy in front of OpenAI with a custom API, SK doesn't support it. "Completion" and "Embedding" are not explicitly scoped to "text", while the implementation supports only text completion and text embeddings. Also, SK has no helpers for DallE and ChatGPT.

### Description
The PR contains the following features and improvements:
* Introduce the concept of a Service Collection (without using the .NET ServiceCollection yet, because that would take more work). The service collection can contain 4 service types:
  * Text Completion services
  * Text Embedding generators
  * Chat Completion services
  * Image Generation services
* Add a client for OpenAI ChatGPT, with an example (example 17)
* Add a client for OpenAI DallE, with an example (example 18)
* Show how to use a custom LLM, e.g. in case of proxies or local HuggingFace models (example 16)
* Add Secret Manager to the syntax examples, to reduce the risk of leaking secrets used to run the examples
* Add 3 new syntax examples: custom LLM, ChatGPT, DallE
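The service-collection idea can be sketched as a simple registry keyed by service type and service id. This is an illustrative Python sketch under assumed names, not SK's actual C# implementation:

```python
# Illustrative sketch of a service collection holding the four service types
# (hypothetical structure and names; SK's real implementation is in C#).
class ServiceCollectionSketch:
    SERVICE_TYPES = (
        "text_completion",
        "text_embedding",
        "chat_completion",
        "image_generation",
    )

    def __init__(self):
        self._services = {t: {} for t in self.SERVICE_TYPES}

    def register(self, service_type: str, service_id: str, service) -> None:
        # Register a named service under one of the known service types.
        if service_type not in self._services:
            raise ValueError(f"unknown service type: {service_type}")
        self._services[service_type][service_id] = service

    def get(self, service_type: str, service_id: str):
        # Resolve a service by type and id, e.g. a custom LLM behind a proxy.
        return self._services[service_type][service_id]
```

Keeping the registry keyed by an opaque service id is what lets a custom or proxied model be swapped in wherever a built-in OpenAI or Azure OpenAI client would otherwise be hardcoded.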
1. Why is this change required? The current example prompts at the end of the sample don't flow well: the first prompt asks for generic ideas while the second prompt assumes that just one idea was output. The third one abruptly switches to asking about a book, with no mention of a book in the previous two prompts and responses.
2. What problem does it solve? A more intuitive and realistic chat experience for a new developer trying out the sample for the very first time.
3. What scenario does it contribute to? A better developer experience while conveying the power of related (follow-up) prompts aligned with a more realistic flow.

### Description
Cosmetic, non-code changes ensuring that this particular example also flows well, just like the other examples.
### Motivation and Context
Most modern operating systems, including macOS, Linux, and Unix, use `\n` as the line-ending character. Windows, on the other hand, uses `\r\n` as the line-ending character. However, when sending data over the internet, `\n` is the standard line-ending character. Therefore, if you're working on a Windows machine and your text contains `\r\n`, you should convert it to `\n` before sending the prompt to the OpenAI API. This ensures that the API can correctly interpret the line endings in your prompt.

Prompt example:
```
Given a json input and a request. Apply the request on the json input and return the result. Put the result in between <result></result> tags
Input: {\"name\": \"John\", \"age\": 30}
Request: name
```
Expected completion (this is what happens when using `\n` for line endings):
```
<result>John</result>
```
Actual completion:
```
" to upper case. <result>\"name\": \"JOHN\", \"age\": 30}</result>
```

### Description
1. Implemented `NormalizePrompt` method in the `OpenAIClientAbstract` class.
2. Added usage of the new method in the `AzureTextCompletion` and `OpenAITextCompletion` classes.
3. Added integration tests for verification.
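The normalization itself amounts to replacing Windows line endings before the prompt is sent. A minimal sketch of the idea follows, in Python with a hypothetical function name; the PR's actual `NormalizePrompt` method lives in the C# `OpenAIClientAbstract` class:

```python
def normalize_line_endings(prompt: str) -> str:
    # Convert Windows-style "\r\n" line endings to "\n" so the OpenAI API
    # interprets the prompt's line breaks consistently across platforms.
    return prompt.replace("\r\n", "\n")
```

Replacing the two-character sequence (rather than stripping `\r` globally) keeps any lone carriage returns that were placed in the prompt deliberately.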
### Motivation and Context
The latest version, 0.9, has some new configuration APIs and breaking changes that require the notebooks to be updated.

### Description
Use the new SK API in the notebooks.
### Motivation and Context
Update the README with the new API. Lock the notebooks to 0.9 so we can release new APIs without breaking the notebooks.
I was trying to run the Getting Started Notebook from step 3 on the [Setting up Semantic Kernel](https://learn.microsoft.com/en-us/semantic-kernel/get-started) documentation page. In the first cell, I got a nuget error: `error NU1101: Unable to find package ...` Incorporated a fix from https://stackoverflow.com/a/73961223/620501 into the troubleshooting section of the readme.
### Motivation and Context
This pull request adds the ability for the Semantic Kernel to persist embeddings/memory to external vector databases like Qdrant. This submission has modifications and additions to allow for integration into the current SK memory architecture and the SDK/API of various vector databases.

**Why is this change required?** This change is required to allow SK developers/users to persist and search for embeddings/memory from a vector database like Qdrant.

**What problem does it solve?** Adds long-term memory/embedding storage in a vector database to the Semantic Kernel.

**What scenario does it contribute to?** Scenario: long-term, scalable memory storage, retrieval, search, and filtering of embeddings.

**If it fixes an open issue, please link to the issue here.** N/A

### Description
This PR currently includes a connector for the Qdrant vector DB only, removing the initial Milvus VectorDB addition and the VectorDB client interfaces for consistency across various external vector databases, which will be provided in a forthcoming PR.
- Addition and modification of the Qdrant.Dotnet SDK
- Addition of a new namespace, Skills.Memory.QdrantDB
- Creating/adding the Qdrant memory client class and QdrantMemoryStore.cs, adding methods for connecting to and retrieving collections and embeddings from a Qdrant vector database in the cloud.
### Motivation and Context
Cleaning up tech debt as per plan, over the last couple of months:
* Moved OpenAI code to the Connectors namespace
* Label => Service Id
* Backend => Service / Connector
* EmbeddingGenerator => EmbeddingGeneration / TextEmbeddingGeneration
**Motivation and Context**
We want to be able to compress and decompress files.

**Description**
Add the FileCompressionSkill in the dotnet samples folder.
### Motivation and Context
With the `0.9.61.1-preview` nuget package, the API for adding the text completion model needs to be updated.
### Motivation and Context
[HuggingFace](https://huggingface.co/) is a platform with over 120k different models, so we need to provide a way to use these models with Semantic Kernel. While users can implement their own backend implementation for HuggingFace models, we can provide a default implementation that is ready for use out of the box. This PR merges the [experimental-huggingface](https://github.com/microsoft/semantic-kernel/tree/experimental-huggingface) branch to `main` from a new branch, in order to avoid a lot of merge conflicts.

### Description
The PR includes:
1. A local inference server for running models locally with Python.
2. Implemented `HuggingFaceTextCompletion` and `HuggingFaceTextEmbeddingGeneration` classes.
3. `HuggingFaceTextCompletion` works with local models as well as remote models hosted on HF servers.
4. Unit and integration tests for the new classes.
5. A HuggingFace usage example in the `KernelSyntaxExamples` project.
### Motivation and Context
Reduce the risk of checking in secrets by removing the ".env" files in the repo. ".env" is already in .gitignore, but some files were checked in earlier.

### Description
Rename all .env files to .env.example.
… classes (#250)

### Motivation and Context
`MemoryRecord`, along with several useful classes in the `SemanticKernel.Memory.Collections` namespace, currently has an internal visibility modifier. This prevents 3rd-party libraries from reusing code when creating implementations of `IEmbeddingWithMetadata<float>`, which is needed in implementations of `IMemoryStore<TEmbedding>`.

Two implementations are currently blocked on this:
* Skills.Memory.Sqlite
* Skills.Memory.CosmosDB

As both of these backing stores are not vector DB stores, they implement cosine similarity comparisons locally, which is performed using `TopNCollection` and its dependencies, all of which are currently internal. Related community feedback item here: #202

### Description
Change the following classes from internal to public:
* MemoryRecord
* MinHeap
* Score
* ScoredValue
* TopNCollection
### Motivation and Context
When using a database-backed memory store, there is typically a unique-id constraint. The current code will supply the same id when multiple paragraphs have been summarised for the same file.

### Description
This PR appends a number to the end of each id to guarantee uniqueness.
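The uniqueness fix can be sketched as appending a running counter to the base id. This is an illustrative Python sketch with hypothetical names; the actual change is in the C# sample code:

```python
def paragraph_ids(base_id: str, paragraph_count: int) -> list:
    # Append an index to the base id so each summarised paragraph of the same
    # file gets a distinct key, satisfying a database unique-id constraint.
    return [f"{base_id}_{i}" for i in range(paragraph_count)]
```

With a per-file counter, two paragraphs summarised from the same source file can no longer collide on the same key in the backing store.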
This change removes the unnecessary destructor/finalizer declared by the OpenAIClientAbstract class and inherited by the backend classes implementing it. Finalizers are usually used to release/clean **unmanaged** resources referenced directly through OS handles. Given that none of the SK SDK backends use any unmanaged resource directly, the finalizer is not really needed, and having the IDisposable.Dispose method is enough to release **managed** resources. There's also a needless performance cost associated with class finalizers: every instance of a class that implements a finalizer is put into a special finalization queue by the garbage collector (GC), the queue is then processed by the GC, and only on the next GC cycle are the instances removed from memory.
Fix README to reflect the content of the latest nuget (AddAzureOpenAITextCompletion call and required libs)
hathind-ms approved these changes on Apr 3, 2023.