
Do you live in Moscow and drive a car? If so, how do you pay for parking? By SMS? Through the Moscow Parking app? With the Telegram bot? “None of this is convenient,” I decided, and built my own skill for Alice that pays for parking by voice. What is more, Alice is already built into Yandex.Navigator, so now you can simply tell the Navigator something like "Alice, ask Moscow Parking to pay for parking 3209 for 30 minutes."
What did I run into while developing the skill?
Session
To start a skill from a third-party developer, you have to tell Alice, "Alice, launch the skill such-and-such." This is fine and convenient for a long conversation with the skill, for example when starting a game. But if you only need to say one phrase and get an answer, “entering” a skill is inconvenient: once you are done with the skill, you also have to “exit” it.
For the one-phrase case, Alice’s developers offer a solution: say “Alice, ask the skill to do such-and-such.” However, the Python example from Alice’s developers does not support this kind of launch:
if req['session']['new']:
Each time a skill is launched, including with the command "Alice, ask the skill ...", the value of session.new is True, so in that example all further processing code is skipped. The solution is to also check the text of request.command: for a plain launch it should be empty.
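A minimal sketch of that check, assuming the handle_dialog(req, res) shape used in the Yandex example. The greeting text and the process_command helper are hypothetical placeholders, not part of my skill:

```python
def process_command(command):
    # Hypothetical placeholder for the skill's real logic.
    return 'You said: ' + command


def handle_dialog(req, res):
    # request.command is empty when the user only "entered" the skill
    # and non-empty for "Alice, ask <skill> to ..." launches.
    command = req['request'].get('command', '').strip()

    if req['session']['new'] and not command:
        # Plain launch: greet the user and wait for the next phrase.
        res['response']['text'] = 'Hello! What should I do?'
        return

    # One-phrase launch or a later turn: handle the command right away.
    res['response']['text'] = process_command(command)
```

With this ordering, a one-phrase launch falls through to the normal processing code even though session.new is True.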
By the way, once you have "entered" a skill, Alice supports the same default exit phrases for all skills: "Alice, enough" and "Alice, come back." If you want to end the session from the skill's side, pass end_session set to True in the response. But this only works on Yandex.Station; on other devices, this does not exit the skill.
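As a rough sketch, a reply that asks Alice to close the session could be built like this. The make_response helper name is hypothetical; the version, session and response fields follow the shape used in the Yandex example:

```python
def make_response(req, text, end_session=False):
    # Build a reply for Alice; end_session=True asks Alice
    # to close the session after this reply.
    return {
        'version': req['version'],
        'session': req['session'],
        'response': {
            'text': text,
            'end_session': end_session,
        },
    }
```

For a one-phrase skill, make_response(req, 'Parking paid.', end_session=True) would be the natural final answer, keeping in mind the device limitation described above.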
Work with numbers
My skill needs to work with numbers: first it has to recognize the user's phone number, then the parking number.
The Yandex example mentioned above uses
req['request']['original_utterance'].lower()
to work with the user's request. At first, I used this field too. To recognize the user's phone number, I had to ask the user to say each digit separately, for example “nine one six one two three four five six seven”, and then replace the text values ("nine") with numeric ones (9) in the code. The parking code turned out even funnier: I pronounce the code “3209” as “thirty two zero nine”, so the code had a bunch of replacements like
s.replace('thirty two', '32').replace('thirty one', '31').replace('zero', '0')
Since the skill code works out what the user wants from the request text (the skill does not use a state machine), I had to do this conversion for almost every (!) user request.
It turned out that Alice’s servers already do all of this for you (and even more). You just need to use request.command instead of request.original_utterance. Yes, this is stated in the documentation, in the tooltip in the answer example:
Service field: user request converted for Alice's internal processing. During the conversion, the text, in particular, is cleared of punctuation marks, and the numerals are converted to numbers.
It is strange that the example from Alice’s developers (link above) uses the original text (request.original_utterance). In fact, request.command does even more than the documentation describes. For example, a phone number is converted to the format (916) 123-45-67, so the user of my skill can now say the phone number in any format convenient to them. Phrases like “Alice, ask the skill such-and-such” and the address “Alice” are also cut out, and typos are corrected.
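A small sketch of pulling the phone number out of request.command, assuming it arrives in the (916) 123-45-67 shape mentioned above. That format is observed behavior rather than documented, so treat the pattern as an assumption; extract_phone is a hypothetical helper name:

```python
import re

# Matches the "(916) 123-45-67" shape observed in request.command.
PHONE_RE = re.compile(r'\((\d{3})\)\s*(\d{3})-(\d{2})-(\d{2})')


def extract_phone(command):
    """Return the 10-digit phone number found in command, or None."""
    match = PHONE_RE.search(command)
    if match is None:
        return None
    # Join the four captured digit groups into one string.
    return ''.join(match.groups())
```

Returning None when nothing matches lets the skill ask the user to repeat the number instead of guessing.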
On Alice's side, individual parts of a request (numbers, names, addresses, dates) can be converted to Named Entities. But it works strangely. The request 79161234567 1234 is converted in named entities to two numbers: 791612345 70 and 1234. I could not find out why the first number comes out different; Yandex.Dialogs technical support is still thinking about the answer.
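For reference, a minimal sketch of reading numeric named entities, assuming they arrive in request.nlu.entities as objects with type and value fields and that numbers are tagged YANDEX.NUMBER. The extract_numbers helper is hypothetical:

```python
def extract_numbers(req):
    """Collect the values of all YANDEX.NUMBER entities in a request."""
    entities = req['request'].get('nlu', {}).get('entities', [])
    return [e['value'] for e in entities if e.get('type') == 'YANDEX.NUMBER']
```

Given the odd result described above, it seems safer to use this for short values such as a parking code or a duration, and to take long digit strings like phone numbers from request.command instead.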
Skill Response Time
Alice waits up to 3 seconds for a response from the skill (Google's limit is 5/10 seconds). My skill calls third-party servers to start and stop parking, and they answer slowly. Sometimes 3 seconds is not enough for my skill to respond. I could not get the response time limit raised for my skill individually. So in some cases I had to sacrifice convenience: for example, when starting parking, the skill does not request the current car set in the user's Moscow Parking profile, but uses the one it saved during user authorization.
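One possible way to stay inside the limit, which is not what my skill does, is to give a slow call a time budget and fall back to a saved value when it runs out. The sketch below uses only the standard library; call_with_budget and the 2.5 second default are assumptions for illustration:

```python
import concurrent.futures

# Shared pool, so a slow call keeps running in the background
# instead of blocking the reply to Alice.
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def call_with_budget(func, fallback, budget_seconds=2.5):
    """Run func(); if it does not finish in time, return fallback."""
    future = _executor.submit(func)
    try:
        return future.result(timeout=budget_seconds)
    except concurrent.futures.TimeoutError:
        # The third-party server was too slow: answer with the saved value.
        return fallback
```

The budget has to be shorter than Alice's 3 seconds to leave room for the rest of the handler and the network round trip.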
And finally, a bit of shameless self-promotion ;) -