I'd like to gain a deeper understanding of a very popular, and reportedly quite powerful, branch of machine learning called "Convolutional Neural Nets" or CNNs. They're a topic that I've done some significant reading on in the past, but I've never really jumped in and tried to do anything more complicated than MNIST, which is a bit like "hello world" for image recognition. To get some real hands-on experience, I need a proper project. While I would like to eventually apply what I learn to my career, this starter project is purely for learning, and thus I can choose whatever I like.
I want the project to be something that's relatively straightforward to do, yet not something that's been done a million times before. Preferably it would be something genuinely unique and useful, but I'm not going to set the bar quite that high yet. Since one major barrier to using CNNs, and deep learning generally, is the need for a large collection of well-labelled imagery, I'd like to choose an image recognition task for which such a collection already exists, or for which one can be generated relatively easily. Yet it has to be something that's "interesting" to me; otherwise I know I'll quickly end up "shifting focus," better known as getting bored and doing something else.
Last night, I hit upon a concept which I hope will be suitable: identify electric vehicles. Specifically, identify the make and model (and ideally, year) from images pulled from dashcam video.
A few thoughts:
- The consistent viewpoint of dashcam imagery should help make training somewhat simpler.
- Nearly all electric cars (in my area, at least) have HOV stickers on their rear bumper. This sticker may be simple enough to detect that I can do it with traditional CV techniques, allowing me to more easily generate a training set. Additionally, the sticker should make it easier for the CNN-based classifier to do its thing.
- I bought myself a dashcam for Xmas. The ability to run the CNN on my own imagery makes it a wee bit more interesting to me.
- Combining electric vehicle detection with geospatial info (pulled from either the dashcam itself, or a cellphone GPS track) could be a fun "stage 2" piece of the project, in which I try to detect regional differences and trends over time.
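To make the sticker idea a bit more concrete, here's a minimal sketch of the sort of "traditional CV" check I have in mind: threshold each frame on a colour range and take the bounding box of whatever matches. Everything specific here is an assumption for illustration: the function name `find_sticker_candidate`, the colour bounds, and the synthetic test frame are all made up, and a real sticker detector would need bounds tuned on actual dashcam frames plus some shape filtering.

```python
import numpy as np

def find_sticker_candidate(rgb, lo, hi, min_pixels=20):
    """Return (top, left, bottom, right) of the pixels whose colour lies
    within [lo, hi] on every channel, or None if too few pixels match.

    rgb: H x W x 3 uint8 image. lo / hi: per-channel bounds (hypothetical
    values; they would have to be tuned on real footage).
    """
    lo = np.asarray(lo)
    hi = np.asarray(hi)
    # True wherever all three channels fall inside the colour range.
    mask = np.all((rgb >= lo) & (rgb <= hi), axis=-1)
    if mask.sum() < min_pixels:
        return None
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    # Bottom/right are exclusive, so the box can be used directly as a slice.
    return int(rows[0]), int(cols[0]), int(rows[-1]) + 1, int(cols[-1]) + 1

# Synthetic stand-in for a dashcam frame: grey background with one
# brightly coloured patch playing the role of the sticker.
frame = np.full((120, 160, 3), 90, dtype=np.uint8)
frame[80:90, 60:75] = (230, 200, 40)

box = find_sticker_candidate(frame, lo=(200, 170, 0), hi=(255, 230, 90))
print(box)
```

Frames where this returns a box could then be cropped and queued for labelling, which is the "more easily generate a training set" part. A single bounding box over all matching pixels is the crudest possible choice; it would fall apart as soon as two similarly coloured objects appear in the same frame.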
Hopefully I'll be able to leverage some existing models for detecting cars generally, then specialize them to detect EVs specifically. That seems like a good opportunity to first get some experience using deep learning tools on something that "should" work, before jumping in and trying to do my own thing. Ideally, I'll manage that without running into the motivation-killing discovery that this is already such well-trodden ground that it's trivial to apply someone else's work rather than build anything of my own.
And so, I have my concept. This will be just a simple project blog. I don't intend to distill and explain the details of CNNs generally; I'll be posting here to keep a record of what I've done and notes about what I've learned.