Compared to the excitement of shipping a new feature, software testing doesn't always get the same kind of love. But while it's sometimes overlooked, testing is an important part of releasing software that behaves the way you (and your users) expect.

At Rasa, we're on a mission to give testing the respect it deserves. We believe that while AI assistants have come a long way in recent years, there's still a big gap between the way many product teams build other types of software and the way teams build AI assistants. We want to make engineering best practices like version control, CI/CD, and, yes, testing the de facto standard for building AI assistants. To that end, we've recently released updates to Rasa Open Source, Rasa X, and our documentation to make testing the easy default when building AI assistants with Rasa.

In this post, we'll explore what testing looks like in the context of building an AI assistant, how to run tests with Rasa, and how to incorporate testing into your CI/CD pipeline. The end result? Less time spent chasing down bugs and more reliable updates to your assistant.

Testing Overview

Before we cover tests that are specific to Rasa and machine learning, let's first take a broader look at testing in software development. Put simply, when you test software, you're making sure the changes you're introducing a) do what they're supposed to do, and b) haven't broken anything else in the application.

In this post, we'll focus on automated testing. Whereas manual tests require a human to evaluate the software by actually using it, automated tests typically run on a CI/CD server after changes are checked into an application's Git repository. Automated tests don't completely erase the need for manual tests, but they do identify a significant number of bugs before they reach production, without additional human effort.

Software tests follow a hierarchy, moving from granular tests that assess small pieces of code to higher-level tests that evaluate how the entire system works together.

Unit tests evaluate the smallest and most specific pieces of code, usually individual functions or methods. As an example, imagine a function that calls an API to get the latest currency conversion rates and then converts an amount in Euros to USD. To make sure it works, we would write a unit test: a second function that checks that the first function returns the expected value. We start with a known input of, say, 12 Euros. Unit tests are performed in isolation from other systems, so we can't call the real API to get the conversion rate. Instead, we supply a mock value, say a rate of 1.10. The unit test feeds these values into the function and then verifies the output against the known value that should be returned: 13.20 USD, in this case. You could say a unit test works by asking a function a question it already knows the answer to.

Integration tests operate at a higher level than unit tests, evaluating how parts of the application work together. However, they don't test the entire application. An integration test typically tests just one feature set or workflow, for example, whether a user can log in to their account.

Functional tests

Functional tests sit one layer higher than integration tests. Functional testing checks how the entire application performs, with all of the pieces working together. Automated functional tests are often run by simulation software because they involve testing the GUI. For example, those who build web-based applications might be familiar with tools like Selenium, which automates actions in the browser.

We've talked about tests in the traditional software sense, but next we'll discuss tests that are specific to Rasa. Instead of testing whether a function returns the expected output or whether a UI is glitchy, these tests measure whether the machine learning models are making correct predictions.

A quick note on terminology: In machine learning, we have the concept of a test set, a portion of labeled data held aside from training to measure the accuracy of the model's predictions. This technique isn't related to software testing, in and of itself. We'll discuss test sets in greater detail when we cover testing the NLU model.
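To make the currency-conversion unit test described in this post concrete, here is a minimal Python sketch. The names `convert_eur_to_usd` and `fetch_rate` are hypothetical and chosen only for illustration; the post doesn't name the function or the API. The sketch assumes the function receives the rate lookup as an argument, which is one common way to let a test swap in a mock. With a known input of 12 Euros and a mock rate of 1.10, the expected result works out to 12 × 1.10 = 13.20 USD.

```python
from unittest.mock import Mock


def convert_eur_to_usd(amount_eur, fetch_rate):
    """Convert euros to US dollars using the rate returned by fetch_rate().

    fetch_rate is a hypothetical callable; in production it would call the
    currency rates API, and in a unit test it is replaced by a mock.
    """
    rate = fetch_rate()
    return round(amount_eur * rate, 2)


def test_convert_eur_to_usd():
    # Stand-in for the real API call: always reports a rate of 1.10.
    fake_fetch_rate = Mock(return_value=1.10)

    result = convert_eur_to_usd(12, fake_fetch_rate)

    # The known answer for 12 euros at a rate of 1.10.
    assert result == 13.20
    # The function should have looked up the rate exactly once.
    fake_fetch_rate.assert_called_once()


test_convert_eur_to_usd()
```

Passing the rate lookup in as a parameter keeps the test isolated from the network. A test runner such as pytest would typically discover and run `test_convert_eur_to_usd` automatically, so the explicit call on the last line is only there to make the sketch runnable as a plain script.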