iPhone photos - how does it group 'related' photos/videos together?
Hey there
So I've been to 2 concerts this week, and I was checking out my videos from last night. Then I scrolled down and it showed me 4 thumbnails, each related to a different time/date since I last synced my phone. They came under 'Related', and basically all 4 of these groups of images/videos are of concerts.
How on earth did it know to group these photos together?
Is it because they all share a similar burst of lighting in the videos? Or because there are loads of figures in the shot? Or because there's a lot of movement in the clips?
I'm just really curious because it's quite cool.
#2
Yeah, I was wondering about this as well! I think it must somehow work out what kinds of shapes/colours are in each picture and whether it's people/scenery/an event, and then group them together based on that - quite scary but clever!
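Just to illustrate the idea (I don't think Apple has published exactly how Photos does it, so this is only a guess at the general approach): grouping by visual similarity roughly means turning each photo into a feature vector and then clustering those vectors. Here's a toy sketch in Python with scikit-learn - the fake "photos" and the crude colour-histogram feature are all made up for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans

def colour_histogram(image):
    # Flattened per-channel histogram: a crude "what does it roughly look like" feature.
    hist = [np.histogram(image[..., c], bins=8, range=(0, 255))[0] for c in range(3)]
    return np.concatenate(hist).astype(float)

# Stand-in for real photos: random dark frames vs bright frames,
# loosely mimicking concert shots vs daytime snaps. (All made up.)
rng = np.random.default_rng(0)
concert_like = rng.integers(0, 80, size=(4, 64, 64, 3))
daytime_like = rng.integers(150, 256, size=(4, 64, 64, 3))
photos = list(concert_like) + list(daytime_like)

features = np.array([colour_histogram(p) for p in photos])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print(labels)  # photos with a similar overall look land in the same group
```

Real systems use learned embeddings from neural networks rather than raw colour histograms, plus things like timestamps and location, but the "features + clustering" shape is the same.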
Gosh, I can't wait to get to grips with programming so I can move into Machine Learning! It's actually so exciting.
#5
(Original post by 21ForEva)
Gosh, I can't wait to get to grips with programming so I can move into Machine Learning! It's actually so exciting.
Thanks a lot for the advice, dude - it sounds like a hella exciting side project. I'll search GitHub to see if someone has started off with something similar for image analysis.

#7
The apps on your phone almost certainly use something like Google Vision to analyse the image - you can read about how it works here: https://cloud.google.com/vision/
Or it might use Amazon Rekognition, which does a very similar thing: https://aws.amazon.com/rekognition/
(Or since it's an iPhone, then Apple Vision - https://developer.apple.com/documentation/vision )
Although the main reason these work so well is the unimaginably vast quantity of images and data that companies like Google and Amazon have stored - Apple/Google/etc. have hundreds of millions of images, each with tonnes of tags/categories/metadata about what they depict and contain, so they've been able to train their statistical models to a really high degree of accuracy.
Luckily, you wouldn't need to do any of that to write an app which can do all of this, because you can just use the Google/Apple Vision or Amazon Rekognition APIs yourself and build an app around them. These kinds of AI/machine learning services are a lot more accessible than you might think; not just for images, but also for audio/speech, etc. For example, the Alexa Skills Kit lets you build your own skills for Amazon Echo without too much effort.
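To give a rough flavour of what calling one of these APIs looks like, here's a small Python sketch against the Google Cloud Vision label-detection endpoint. It assumes you've created a Google Cloud project, set GOOGLE_APPLICATION_CREDENTIALS to a service-account key, and installed the google-cloud-vision client; the file name is just a placeholder:

```python
# Rough sketch of asking the Google Cloud Vision API what's in a photo.
# Assumes: a Google Cloud project, GOOGLE_APPLICATION_CREDENTIALS set,
# and `pip install google-cloud-vision`.
from google.cloud import vision

def label_photo(path):
    client = vision.ImageAnnotatorClient()
    with open(path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.label_detection(image=image)
    # Each label has a description ("Concert", "Crowd", ...) and a confidence score.
    for label in response.label_annotations:
        print(f"{label.description}: {label.score:.2f}")

label_photo("concert.jpg")  # placeholder file name
```

Rekognition and Apple's Vision framework have very similar "give it an image, get back labels/faces/text" calls, so the shape of the code is much the same whichever one you pick.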
#9
(Original post by 21ForEva)
Mind blown. Completely mind blown. I'm sick right now with the flu and bored out of my head, so I will definitely read those articles. Thanks a lot for the info!
(Original post by winterscoming)
You're welcome! I had a look through some of the tutorials and example code they've got for the Google Vision API, and it all looks fairly beginner-friendly - if you get a chance to try out some of their tutorials and example code, there's loads of interesting stuff to tinker with. Something like this is great for a spare-time coding project.
https://cloud.google.com/vision/ -- omg, so I dragged in images of myself and also my fav KPOP band posters, and wow, I am seriously blown away by how much the API can predict features and properties... like it can tell if my face is full of joy or sadness. It's mad!
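For anyone wondering, that "joy or sadness" bit is the face-detection part of the same API. Something like this (again just a sketch, with the same setup assumptions as the label-detection example above; the file name is made up) prints the emotion likelihoods that the drag-and-drop demo shows:

```python
# Sketch: face detection with the Google Cloud Vision API, printing emotion likelihoods.
# Assumes google-cloud-vision is installed and credentials are configured.
from google.cloud import vision

client = vision.ImageAnnotatorClient()
with open("selfie.jpg", "rb") as f:  # placeholder file name
    image = vision.Image(content=f.read())

response = client.face_detection(image=image)
for face in response.face_annotations:
    # Likelihoods come back as enum values like VERY_LIKELY / POSSIBLE / UNLIKELY.
    print("joy:", face.joy_likelihood, "sorrow:", face.sorrow_likelihood)
```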
#12
(Original post by 21ForEva)
Thanks - I will definitely check it out. Sounds way more interesting than the coding projects I have to do in my team at work haha.