
Oh yeah, thanks, Paul. So, as has already been said, hi, my name is Lena. Just a quick introduction, even though it's in the leaflet: I'm a third-year computer science student in Aberystwyth. As you can probably tell, I'm from Wales, and I have been spending some time looking at languages, and also looking at how those translate to more technological aspects, the technological revolution. So the idea of how this came about was that in my first year I lived in Welsh-speaking-only halls, and most of my friends were studying things like history, like Welsh, but I was studying maths and computer science, and I got an
Amazon Echo for my birthday, and one of my friends went, "Oh, Lena, don't worry. As long as we all speak Welsh, Alexa can't listen to us, Amazon can't listen to us," whatever. And I was going, yeah, okay, I can believe that. But then I started thinking it's a matter of time before we really open it up to the different languages, as these devices get more and more influenced by languages. So I was going, well, how long will it actually take for this to happen, so that we can have most languages in the world used on our home
devices and things like that? And my language, Welsh, is an even weirder language because it's got different dialects, and dialects are so interesting to translate. When it comes to technology we don't always think about different dialects, but it is something that is a weird complication. The other influence I had was a book I read a few years back called Y Dydd Olaf. It's in Welsh; I don't know if it's ever been translated into English. It translates to The Last Day, and the premise of the book is it's the story of a 16-year-old, I think he was at the time, a kid called Marc,
and he basically is in this dystopian society, and in this society, sort of 1984-style, people are slowly being turned into robots. They're slowly making this transition, and he's recounting the fact that because he's writing in Welsh, the robots can't get to him, they can't understand him. Until you get to the end of the book and you realize that he's been converted too, and he starts spouting gibberish, which is the way in the book that we realize that people have been turned into robots. So, a bit of backstory on the writer, because it is a little bit relevant to the talk: it's a guy called Owain Owain, the guy who's
pictured down there. He was a nuclear scientist in the fifties. He grew up in North Wales, in Pwllheli, he went to university in Bangor, he then went into nuclear science, then became a lecturer in Bangor [unclear]. So he was a scientist, but he was also a linguist, and a great proponent of the Welsh language. He was the founder of Tafod y Ddraig, which was like a magazine about Welsh-language matters, and a very instrumental figure in what is today known as Cymdeithas yr Iaith Gymraeg, the Welsh Language Society, who you might see on the news for painting things, stickering things, and being a little bit revolutionary and a little
bit out there, but it's to get the point across that we want our language to be valued. He also developed the idea of Y Fro Gymraeg, the areas of Wales that were very Welsh-speaking, and was a bit of a prophet [unclear], who had this prophecy of the language awakening in Wales. I like to think of Owain Owain as someone who was ahead of his time. He once came out with this quote here, which I've written down because it's long to say, but he basically started contemplating the role that computers would play in the development of the human race, and he said that we would turn, as humans, and
the education system [unclear], we'd turn from humans, and an education system, just being people who are remembering information to people who could use that information, because the computers would then be the memory. So that got me thinking about the idea of humans becoming fact-finders rather than just storage for information, and that would really change the way that education was run. And he said this back in the 70s, actually '76, about the time he wrote the book, so I think it was quite a prophecy as to how we run things now. So then, you know, I
got to university and I felt cool and started listening to music. I do a lot of work with the Welsh-language music industry, and this album came out by a singer who's originally from here in Cardiff, Gwenno. She wrote this album called Y Dydd Olaf, based around this book that Owain Owain had written, and it got me thinking once again about the language and about technology. So here are some lyrics out of the album that I've translated, and it's sort of this idea that machines collect data and estimate things and would change our fate within the century. I loved this idea, so I thought, okay, well, how long is it before machines can start understanding, you
know, Welsh or other languages that are sometimes considered, as I said, minority languages? How long will that take, until we do have to start worrying about the security of what we're saying in our homes? It's a weird one. So I continued. In terms of people who speak the Welsh language, there are two facts to consider here. Coming back to the idea of Y Fro Gymraeg, that's what's on the left side here: the areas of Wales that are more Welsh-speaking. I live here in Aberystwyth but I learned Welsh up here, so my Welsh is quite northern, but I have run into friends from university, and that's the
second map, the dialects in Wales. People go, oh, [unclear], northerners hate the southerners or anything like that, but actually there are quite a few dialects. In fact there are five dialects, four of which are in Wales and one of which is in Argentina. So the language, [unclear], the language has evolved, and the way that the dialects have changed is very interesting to consider when you relate it to technology, because it's not just a case of recognizing a word: the word can change from dialect to dialect, as can the sentence structure, so it's a question of semantics. So in terms of implementing a language
recognition on an Amazon Alexa or on a Google Home, it's quite a tough problem. So those are the two sides of the problem that I was looking to consider, and there is a bit of competitiveness over the dialects, which doesn't help. Coming to the idea of home devices, the first one that I looked at was the Google Home, which has been hitting the news for years for how many languages it supports. It's quite a few languages, but even within the languages it supports different dialects. For example, for the French, which I have used on the Google Home, I'm also a French speaker, you can speak the Québécois [unclear]
dialects to Google Home and it will figure out what you're saying even though the words are different. So I was thinking, how does it do that? And actually, when I looked into it, it was very complex: it had been trained through neural networks, and it was basically taking a word and associating it with other words. So these are all the languages that Google Home has implemented, and there's quite a good spread, but there's really not that many when you think about it. I know that my dad got a Google Home, and he's from Lebanon, and
he started yelling in Arabic, and I don't even want to explain what came out of the Google Home, because it was just a bit confused. So then we move on to the other big player in the game, the Alexa. I have an Echo Dot, and she was endlessly confused when I was asking her to play Welsh music. I'm not even going to tell you what she was trying to play at me. But yes, seven languages in total, again with a range of different dialects, and Amazon have been trying to implement even more languages on here, but they're a bit behind schedule. So these are the languages as of 2018. They set goals for 2019, but
they're less than halfway to those goals of implementing the other languages, so it's a bit of a slow game in terms of trying to make it more accessible to other people. This brings me to the idea of machine translation. I was like, okay, well, if I want to look at the aspects of security, I've got to try and implement this myself first. Machine translation obviously deals with semantics, deals with morphological differences, grammatical complexities. Some languages are a lot more grammatical than others: you look at languages like French and even Welsh versus English. I remember when I was learning English, I had no idea what was right until someone
went, "It's fine, it just sounds right," whatever that meant to me as a learner. So I put some examples of Welsh phrases that have differences between the dialects, just to give you an example of the fact that you have difference in phrase, difference in pronunciation, and sometimes difference in words. A simple thing like asking for a cup of tea: I remember meeting one of my closest friends, who was from South Wales, and he asked me for the first time, he's like, [unclear], and I was like, what does that mean? I'd never heard that before, and that's in the same country, maybe about three hours' drive in between, and you've got a completely
different sentence. So that's the first thing, you've got the difference in vocabulary. So paned and dishgled, they mean tea, both of them. They sound completely different but they mean tea. And then milk is the other one [unclear]: I would say llefrith, my friends would say llaeth. There is sort of a cutoff line they've tried to map as to how the language has changed, but trying to implement this on an Alexa took time, because it was a case of connecting those words together and trying to think of all of the different dialects and all the different words
they use. This is only North and South; like I said, there are five different dialects. Sometimes you can get five different words for the same thing, and they're not synonyms, because different people would use them in different sentence constructions. The other idea is the fact that phrases are pronounced mildly differently. For example, to say sorry, in the North you say mae'n ddrwg gen i, but in the South you'd say mae'n ddrwg 'da fi or [unclear]. So it's slightly different, and if you've got a device, it can slip up a bit. So that was what I looked at, what I had to take into account, and I spent
time talking to many different people about this topic, including my university lecturers, and they were a bit confused as to why it hadn't been done before and why I was taking so much time and being so nitpicky. But in terms of the security argument, I argued that, a bit like in the book Y Dydd Olaf, the whole idea was that the character Marc was writing in Welsh so that the machines wouldn't understand him, and I really wanted to know how long it would be before languages like Welsh would be understood by home devices. So I tried it myself. Okay, so the methods that I used: I went for a sort of standard encoder and
decoder architecture for the actual neural network that I tried to implement. I had some very interesting math and it didn't fit on the slide, so I decided to put videos instead. The idea is that you have a source sentence, like the little phrase that I was saying, because you wouldn't really say a lot or even have a long chat with a home device, so that was something I took into account. It's usually, "Alexa, can you play this song," "Alexa, can you set a timer," "can you order some shopping." It's usually a set sentence that people say, but it doesn't have to be a set sentence; it's just not going to be a very long chat. I
don't know how you are with your home devices, if you have home devices; I don't tend to chat to mine for that long. So the idea is you have a source sentence, but with a sentence construction that you know. You know where a verb could be, you know where a noun could be, and descriptive adjectives, you know where those would be in the sentence, approximately, because grammatically, even in English and in many other languages, the construction is about the same no matter what sentence you're saying. So you have a slight idea, which works to an extent, and you push it through to the target: you push it through with a range of
vector values, basically, to be able to tell what it is. You analyse it word by word, what each word is saying, what that word could mean, and then you put it back together, like I've put here with the W, X, Y, Z. You put it back together and you've got a sort of machine view of what the instruction is that you've asked this machine to do. So you say something, it gets translated into a way that the machine can understand, and then it goes off and does something. With the home devices, the more I looked into this, the more I realized that the reason that a lot of the languages hadn't been
properly implemented was that you could say something, for example, to an Amazon Alexa in French, and it would take the French sentence, translate it first into English, proceed to do the action in English, and if there was an answer required from the machine, it would translate it back into French and give you the answer, because obviously you're assuming that the speaker speaks French. So saying that these machines are understanding us is a little bit false, because what they're doing is translating what you've told them to do, doing something, and then translating it back if there's an answer to give. So I kept tinkering with this, thinking, okay,
well, I got it to go from Welsh to English and back into Welsh if it had to give me an answer. The bit that I then tried to test was this: within all of this scheme of Welsh dialects, there was a failed attempt in the 80s to implement a standardized Welsh language, Cymraeg Byw, Living Welsh. The idea was it would be taught in schools, [unclear], people would speak it. It just never took off. It just didn't work, because there was so much regionalization of the language that had taken place over years and years that it didn't catch on. No one really understood this new dialect. So it still
exists, there are still books in it, and people can understand what it's saying, but no one would really speak in that way. So I thought, okay, well, if there's a standard version of Welsh, a bit like you kind of have a standard version of English, what if I use that and just pass it into Cymraeg Byw instead of translating it into English, so that at no point in the process of translation, of ordering and stuff like that, would I have to use English in my Alexa? I thought that would be a really interesting idea, and there had been some previous research done in Swedish, on the way that Swedish dialects worked, so I've
been starting on that. And I thought that, just to demonstrate how far I got, I had a little video of me trying to talk to my Alexa, which I hope will play. Oh dear, I know it's a bit loud, I'm sorry, but if you can hear, [Welsh audio, unclear]. So I asked it to play this song, [unclear], which is here; [unclear] you can give it a listen. So I've gotten it to play things to a decent amount of accuracy, so the system works, which is great. Just in conclusion, this is an ongoing project, I'm definitely nowhere close to done, and
I'm always finding things to play around with, but I thought the security implications of this issue were quite interesting, because this all started off with people going, well, how long will it be until all these machines understand all of these languages? So I found this really cool picture of a language tree, and it shows you what the different kinds of languages are, the roots they come from, and the links they have, because really that is something that we could implement within our sort of network approach [unclear] when we're implementing it in home devices and stuff like that. In terms of my study, I'm way down here in the Celtic languages, so I have toyed
with the idea of expanding my work to other things. Like, I'm learning Cornish. I know, "learn a language where you can talk to people" is what my mum said, but I'm learning, and there's actually a lot of common roots there. So I was thinking, why wouldn't I extend the idea to other languages, where instead of having a network of just the words in Welsh, you can have them in different languages and just sort of build it on like that, keep tacking on for any one word, with the idea being that eventually I could speak to my Alexa in any of the languages and she would just understand what I was saying, because it would just
be one word with one meaning, and then you'd have different words with that same meaning, and just sort of building up that network. Now, I know it takes time, but I'm 21 and I have ages. So what is the point of having, [unclear], what's the point of having more languages? I mean, why can't everyone just speak English or French or Arabic or Mandarin or any other thing? Why do you want to work on languages? For me, speaking Welsh is a big part of who I am, and working in Wales is a big part of who I am, but I'm also a computer scientist, so I like to play around with things. So
it's not just to do it as an experiment and give me extra challenges within the dialects and stuff like that; the point is also to make technology more universally used. I was up seeing a friend in North Wales with his grandmother, and I was showing her my Alexa. Now, she speaks very little English; she's always lived in this village [unclear], speaks very little English, and she was really trying. But when I was talking to her about this tech, she thought it was so cool. She was like, "Oh my god, I can speak to a computer in my language." And it was that little boost that, you
know, was motivating. I thought, okay, I've got to keep trying at this project, even though I wasn't doing so well at the time. Got to keep trying. So that was the point, but there's also a security aspect, I think. In having more languages, you're also opening yourself up to more problems. I mean, like [unclear] in the keynote, the idea is the more open you become, the more security breaches you could have. I think it invites us into weird and wonderful ways of solving problems. So that leads me on to the next point: if we keep opening up, we are eventually going to modernize language. Another phrase I get when I'm outside of Wales
is, "Why are you learning a dead language?" It's very hurtful, but you get that. So how can this modernize the language? Like I said, it makes it more used, that's something. But we also have this scheme in Wales, Cymraeg 2050: we want to get a million Welsh speakers by 2050. Whether we do that or not is neither here nor there for this project, but it's about making a language modern and more used. And if you have the opportunity in the house to speak it, or in the office, there are a lot of people using home devices in the office, to speak the language, I think that is more
important, and in sort of building people's confidence up. I've met a lot of people who come up to me and go, "I've been learning Welsh and I can write it, but I don't have the confidence to speak," and I think that this would really build that up, not just for the Welsh language but for other languages. I know that I've got friends learning Mandarin who haven't been to China yet, and they just go, "Well, I don't have the confidence to speak Mandarin, but I can write it fine." And it's that idea of building up; there's a lot of richness to be had in those languages if we build it up by using technology as an upside.
Last but not least, again, the security implications. Well, adding a language, adding anything to a system, has security implications, and I had my lecturers turn to me and go, "Oh, well, if you're overwriting certain parts of the system on the Alexa, have you considered what that will do to the Alexa? Be careful you haven't hooked up your credit card. Be careful, anyone could order anything in your house." And it's a case of implementing more things on top. So because I've been overwriting the original system, I've not been using an API; I was originally, and
then I just started from scratch and started overwriting it. There was the idea of setting my own types of security, so whenever I want to order my shopping, I've got like a little password thing that I say. It's a couple of, not so much song lyrics, poem lyrics, poem phrases, I don't know what that's called, lines from a poem, that I say, and the Alexa will know it's me. So you can play around with it a little bit and have some fun. So yeah, that was that. [unclear] Any questions? Thank you.

Thank you. So, we have time for questions. Yes? Okay, just yell. Have I played with, [unclear], Mycroft? Yeah, Mycroft, yeah, [unclear]
[unclear], I forget which company that is. [Applause] Yeah, it's, [unclear], it's Mozilla Foundation, yeah, I think Mycroft is Mozilla Foundation. [unclear] And yes, actually, I played with it in Welsh, and I played with it in Cornish to practice my Cornish. But with Minecraft? Yeah, about Mycroft, yeah, sorry. No, I have played with Mycroft, not for the games. I wish it was, that would have been great. [unclear] Yeah, no, I have played with it; I have to look into it more. It's pretty, well, the only issue is, in terms
of, and this is only an issue in terms of Welsh, it's only accounting for two of the dialects. When you give it a go, it only accounts for the northern and southern, whereas there's a few others which have to be accounted for. But I think that the main differences between those are easier to take into account; the differences between the others are more pronunciation and intonation, so it's more of a niche problem, which hasn't really been explored yet. Yeah, exactly. Like I said, there's different sort of categories of the dialects, so there is a case of broadening out, but I know that you can do it
within the engine, I just don't think it's been done yet.

Hello, I've got a question: are you aware of any Welsh [unclear] brute-forcing dictionaries that are out there? Sorry, I can't hear you. Are you aware of any Welsh-based brute-force dictionaries out there at the moment? I'm really sorry,
not really, no, I'm not aware of any. I haven't started looking for some, but it was actually when I was showing this to my friend yesterday that they went, "Have you considered this?" Like I said, it's early days, but yeah, I need to have a better look into it. The thing is, with Welsh, the infrastructure is not really out there, so it's about sort of tinkering with it. A lot of the issue is also with a lot of [unclear], because not many of them are in tech, so not many of them think of this; it's not something that's really thought about. The tech community, conversely, you
know, don't really think of it, so it's bridging the two and juggling the two. Yeah, sorry, it's just 'cause there's a lot of people [unclear]. Yeah. Anybody else? Thank you. [Applause]
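
A minimal sketch of the dialect-handling idea described in the talk, not the speaker's actual system: regional words are mapped onto one shared concept before a request is compared, so no English pivot is involved. The North/South word pairs (paned and dishgled, llefrith and llaeth) are the ones mentioned in the talk; the concept labels, the function names, and the tiny lookup table are assumptions made purely for illustration.

```python
# Hypothetical lookup: regional Welsh word -> shared concept label.
# The word pairs are the North/South examples mentioned in the talk;
# the labels and everything else here are illustrative assumptions.
DIALECT_TO_CONCEPT = {
    "paned": "CUP_OF_TEA",     # northern
    "dishgled": "CUP_OF_TEA",  # southern
    "llefrith": "MILK",        # northern
    "llaeth": "MILK",          # southern
}


def normalise(utterance: str) -> list[str]:
    """Replace known dialect words with their concept label, word by word."""
    tokens = utterance.lower().split()
    return [DIALECT_TO_CONCEPT.get(tok, tok) for tok in tokens]


def same_request(a: str, b: str) -> bool:
    """Two utterances match if they normalise to the same token sequence."""
    return normalise(a) == normalise(b)
```

With this table, `same_request("paned", "dishgled")` is true. Differences in sentence structure and pronunciation, which the talk also raises, are not handled by a word-level lookup like this.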