Abstract
To make trip planning easier, the GetGoing system gives seniors transportation information in a usable and understandable way. It is accessed through a phone number and provides information about bus schedules, driving, biking and more at a pace that is easier to follow. GetGoing can be paused so that the information can be written down. The system is being personalized so that preferences, such as driving vs. taking the bus, are remembered and do not have to be entered on each call. This year the system will be modified to adapt to the way the user speaks, making its output more like the user's own speech and thus more understandable. References to monuments will also be added so that users can express a destination the way they normally would, such as "my son's house" or "where the Isalys used to be".
Description
Many Americans own smartphones: at present 77%, up from just 35% in 2011, when Pew Research Center ran its first survey of smartphone ownership (PEW 2017). Getting trip information from providers like Google is difficult, if not impossible, for the 23% of adults without a smartphone, as such services can only be accessed via desktop, laptop or smartphone. The access gap is larger for the portion of the American population that has a low level of education (only 54% have smartphones), and larger yet for people over 65: only 42% of seniors have smartphones!
Yet these people need to get around just as much as everyone else does, and without access to transportation services their mobility is limited. To this end, we have developed GetGoing, a phone-based trip planning agent for seniors. GetGoing has been developed specifically with senior users in mind. As such, it differs from other voice-based agents in its comprehensible delivery of information (e.g., paused speech, segmented information). In the future, the system will be able to adapt to the habits and preferences of each individual traveler.
GetGoing is based on our prior work on the Let’s Go and 311 systems. The target geographic area for GetGoing is Pittsburgh. The proof of concept GetGoing, demonstrated in early November 2018, gives the caller access to bus information and driving directions, in a comprehensible conversational manner. During Y1, it is being expanded to include biking information as well. Other transport such as Uber, Lyft and Access will be added later. During the 2019 spring semester, this agent is scheduled to be presented to test groups of members of OSHER Life Long Learning at CMU and AARP Chapter Presidents. They will try the system for themselves by calling its local phone number and provide feedback and feature requests, in an effort to improve the system going forward. The presentation of the agent includes a comparison to the same agent without the senior-friendly features. Our first demonstration to the President of the CMU OSHER chapter was very successful (see attached letter of support).
Seniors can benefit from a phone-based agent provided that it adapts its speech to their changing listening and cognitive abilities. As we age, we are:
- able to process less information in a given period of time,
- less able to focus on more than one thing at a time, and
- less able to remember large amounts of information.
We have addressed these three issues in the proof of concept system by:
- Slowing the speaking rate: seniors need more time to process information; they do not necessarily need the information to be louder.
- Saying something to get their attention before giving important information.
- Uttering short phrases, each with one piece of information.
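The three measures above can be illustrated with a minimal Python sketch that wraps a list of short direction phrases in SSML markup. This is not the GetGoing implementation: the function name `to_senior_friendly_ssml`, the attention phrase, the pause length, and the assumption that the synthesizer accepts standard SSML (`prosody`, `break`, `s`) are all illustrative.

```python
from xml.sax.saxutils import escape


def to_senior_friendly_ssml(steps, attention_phrase="Here is your next step.",
                            rate="slow", pause_ms=800):
    """Wrap short direction phrases in SSML (hypothetical helper).

    - a slow speaking rate is applied to the whole prompt,
    - an attention-getting phrase precedes each piece of information,
    - each step is spoken as its own short sentence, followed by a pause.
    """
    parts = ["<speak>", f'<prosody rate="{rate}">']
    for step in steps:
        # Attention getter first, so the listener is ready for the content.
        parts.append(f"<s>{escape(attention_phrase)}</s>")
        parts.append(f'<break time="{pause_ms}ms"/>')
        # One short phrase carrying one piece of information.
        parts.append(f"<s>{escape(step)}</s>")
        parts.append(f'<break time="{pause_ms}ms"/>')
    parts.append("</prosody>")
    parts.append("</speak>")
    return "".join(parts)


if __name__ == "__main__":
    print(to_senior_friendly_ssml(["Take the 61C bus.", "Get off at Forbes & Murray."]))
```

Keeping the rate and pause length as parameters would let a later per-user adaptation step change them without touching the phrase content.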
So far, our work has centered on improving what the agent says and how it says it. In Y2, we will shift our focus to the user, adapting to each individual. This will involve modeling the user and modifying system behavior so that it is more efficient and more accurate, and so that it responds in the most effective way given the individual capacities of the user (for example, needing information repeated often).
The agent will adapt its output to the terms that users employ in order to make itself more understandable, for example by accepting references to monuments as the start and end points of a journey. A monument can be any physical item: a building (Newell Simon Hall), a thing (Kaufmann’s clock), or a place of business (Craig Street Starbucks). Monuments do not necessarily need to exist at present (Kaufmann’s clock). The agent will also adapt to the speed at which the user wants to get information (with the system waiting, or not, for them to write things down).
Other issues that will be addressed are handling multiple calls at one time on one server, knowing the user’s usual point of departure and mode of transportation, modifying some of the output of the Google backend to make it easier to understand, and finding the speech synthesis voices that are best understood by the target population.
To recap, the proof of concept system will be modified in Y2 by:
- incorporating monuments as points of departure and arrival: this involves getting a reliable dataset of monuments with links to streets and corners, connecting it to the natural language understanding module (NLU), then modifying the NLU (slots and intents) to appropriately use this data, then doing the same for the natural language generation module (NLG).
- entraining (adapting) to the words and expressions that the user employs: this involves setting up a confidence score thresholding mechanism on the result coming from the automatic speech recognition module (ASR), using it to identify reliably recognized words that the system’s NLG does not already use, and putting those words into the NLG’s utterances at the grammatically and semantically appropriate places.
- adapting to the speed at which the user speaks and at which the user wants the information delivered: this involves using software such as openSMILE to detect the rhythm of the user’s speech and then inserting or removing pauses in the synthetic speech accordingly.
- being available to multiple users at the same time: this involves modification of the manner in which the server is used and possible purchase of other equipment.
- personalization to the user’s usual points of departure and modes of transport: this involves creating a profile for each user, which at first will last only for the ongoing call and will later become permanent. For the permanent profile we will use caller ID to give us the origin of the call and encode it to create a unique caller identifier. Attached to this identifier will be a module holding information such as speed of speech, need for time to write things down, preferred mode of transport, etc. This module will be updated automatically after each ensuing dialog.
- making the information from the Google backend easier to understand: this involves eliminating abbreviations and repetitions, for example expanding “St” to “street” rather than “saint”.
- having a very understandable voice: this involves the comparison of several different possible system voices and then the installation and testing of the chosen voice.
- allowing continuation of trips. This will exist in two different forms, with different use cases. The first form is when the user can call the system and ask for the next direction (e.g., "I'm at Forbes now, what's my next bus?"). The system should remember the trip the user previously requested, rather than requiring the user to say their destination again. The second form of continuation would allow the user to keep the call going and request the next step as they need it. This second form would be most useful when driving, as the user can request the next step after completing the previous one.
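Two of the steps above, confidence-thresholded entrainment and a caller profile keyed by an encoded caller ID, can be sketched in Python as follows. This is a hedged illustration only: the `(word, confidence)` shape of the ASR output, the threshold value, the salted SHA-256 encoding, and all names (`entrainment_candidates`, `caller_key`, `CallerProfile`, `update_profile`) are assumptions, not the GetGoing design.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional

CONFIDENCE_THRESHOLD = 0.85  # assumed value; would need tuning on real calls


def entrainment_candidates(asr_words, system_vocabulary, threshold=CONFIDENCE_THRESHOLD):
    """Return reliably recognized user words that the NLG does not already use.

    asr_words is assumed to be a list of (word, confidence) pairs.
    """
    known = {w.lower() for w in system_vocabulary}
    seen = set()
    candidates = []
    for word, confidence in asr_words:
        w = word.lower()
        if confidence >= threshold and w not in known and w not in seen:
            seen.add(w)
            candidates.append(w)
    return candidates


def caller_key(caller_id, salt="example-salt"):
    """Encode a caller ID as an opaque profile key (salted SHA-256, an assumption)."""
    digits = "".join(ch for ch in caller_id if ch.isdigit())
    return hashlib.sha256((salt + digits).encode("utf-8")).hexdigest()


@dataclass
class CallerProfile:
    """Hypothetical per-caller module holding remembered preferences."""
    preferred_mode: str = "bus"
    needs_writing_pauses: bool = False
    last_destination: Optional[str] = None  # supports continuing a trip on a later call
    user_terms: set = field(default_factory=set)


profiles = {}  # in-memory store for the sketch; a real system would persist this


def update_profile(caller_id, asr_words, system_vocabulary, destination=None):
    """Create or update the caller's profile after a dialog turn."""
    profile = profiles.setdefault(caller_key(caller_id), CallerProfile())
    profile.user_terms.update(entrainment_candidates(asr_words, system_vocabulary))
    if destination is not None:
        profile.last_destination = destination
    return profile
```

Storing only the encoded key rather than the raw phone number keeps the profile store from exposing caller numbers directly, and remembering the last destination is one simple way to support the first form of trip continuation.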
Timeline
Months 1-3: make the system usable by many callers at the same time, install barge-in, begin adaptation
Months 4-6: test changes from Months 1-3, start automatic adaptation, work on system voice, understandability issues
Months 7-9: use monuments as terms of reference, continuation of trips
Months 10-12: system assessment with real users from AARP and OSHER, presentations to new users, Tech Transfer
Strategic Description / RD&T
Deployment Plan
In Y2, GetGoing will be tested by several panels of users. With those results, and with feedback from CMU Tech Transfer, the GetGoing team will decide whether to:
- spin off the system,
- find support that will enable us to keep the system live as a CMU system (AARP and OSHER cannot provide this funding, so we will look to NSF or another source), or
- license the technology to companies that can use it on their smart agents to make them more understandable for seniors.
Expected Outcomes/Impacts
By the end of Y2, we expect to have a robust, fully functioning system.
Metrics:
Assessment is an important part of Y2. By mid Y2, we expect to have a system that is robust enough to be tested by multiple populations (AARP chapters, OSHER groups, senior centers, etc.). We will test:
- System robustness: the goal is to have a low error rate and, in case of error, to fail gracefully (e.g., “I can’t do that yet”, rather than crashing). To assess the robustness of the system, we will repeatedly call it under more and more complicated scenarios (long directions, change of destination, repeat requests, noisy conditions, etc.). We will also develop a suite of automated tests that will ensure the robustness of various system components.
- Efficiency. The system should be able to satisfy the user’s request in a correct and timely manner. We will evaluate the efficiency by asking subjects: first to rate (Likert scale) the system as to whether it gave them the correct information; second to rate how easy it was to use the system. The combination of these two measures can give us a measure of how efficient the system is.
- Effectiveness. The system should provide value to the user and should effectively communicate directions. We can evaluate the effectiveness of the system by giving subjects scenarios to follow that request directions between two specific locations, and asking them to repeat the directions after the call is finished.
- User willingness to use the system again. We will ask subjects to rate whether they would call the system again (and if not, what they would use to get the same information). We will also keep track of real user calls by measuring call-back percentages. If a user calls the system multiple times, it indicates that the system is providing some value to them.
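As an illustration of the call-back measure in the last item, the short Python sketch below computes the percentage of distinct callers who called more than once. The log format (one opaque caller key per call) and the function name `callback_percentage` are assumptions; the text does not specify how call-backs will be counted.

```python
from collections import Counter


def callback_percentage(call_log):
    """Percentage of distinct callers who called more than once.

    call_log is assumed to be an iterable with one caller key per call.
    """
    counts = Counter(call_log)
    if not counts:
        return 0.0
    returning = sum(1 for n in counts.values() if n > 1)
    return 100.0 * returning / len(counts)


if __name__ == "__main__":
    # Four distinct callers; "a" and "c" called back.
    print(callback_percentage(["a", "b", "a", "c", "c", "c", "d"]))
```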
If we have time, we may also make two versions of the agent, one for Yinzers and the other for non-Pittsburghers.
The result of this assessment will enable us to start to advertise the system to the general public in Pittsburgh and in turn to decide how to find other partners. By the end of Y2, we will determine whether GetGoing could be spun off.
Expected Outputs
TRID
Individuals Involved
Email | Name | Affiliation | Role | Position
max@cs.cmu.edu | Eskenazi, Maxine | LTI | PI | Faculty - Research/Systems
yulanf@andrew.cmu.edu | Feng, Yulan | LTI | Other | Student - Masters
mehrishikib@gmail.com | Mehri, Shikib | LTI | Other | Student - Masters
Budget
Amount of UTC Funds Awarded
$100000.00
Total Project Budget (from all funding sources)
$126032.00
Documents
Type | Name | Uploaded
Data Management Plan | Get_Going_Data_Management_Plan_2019.docx | March 12, 2019, 9:21 a.m.
Presentation | 2019_-_Mobility21.pptx | March 12, 2019, 9:24 a.m.
Publication | CMU GetGoing: An Understandable and Memorable Dialog System for Seniors | Sept. 26, 2019, 9:51 a.m.
Progress Report | 286_Progress_Report_2019-09-30 | Sept. 26, 2019, 9:52 a.m.
Presentation | GetGoing presentation | March 25, 2020, 6:09 a.m.
Progress Report | 286_Progress_Report_2020-03-30 | March 25, 2020, 6:10 a.m.
Final Report | Final_Report_-_286.pdf | July 7, 2020, 12:56 p.m.
Presentation | GetGoing_Presentation_1.pptx | July 8, 2020, 9:34 a.m.
Match Sources
No match sources!
Partners
Name | Type
AARP | Deployment Partner
OSHER | Deployment Partner