Title

Security Camera Counter.

Summary

We are going to create a program that counts the people who appear in a video feed and tracks their direction of travel in near real time. We plan to use an NVIDIA GTX 780 GPU along with CUDA to complete our project.

Background

Our project is a program that, given a video feed, counts the number of people who appear in the frame and computes a movement vector for each of them. We need to quickly compute the differences between frames to determine whether people have entered or exited the frame, and to correctly identify people within frames. To process video feeds in real time, we need to process large images very quickly.

A serial implementation would struggle with this problem because the images are very large and we need to compare every pixel to see whether its shading has changed, which indicates someone moving into our field of view. As a result, a serial implementation is unlikely to keep up with a real-time feed: the feed will have already changed before it has finished analyzing the current frame.
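As a rough illustration of the per-pixel comparison described above, a serial version might look like the following Python sketch. The function name changed_mask, the list-of-lists grayscale frame layout, and the threshold of 30 are assumptions made for illustration only.

```python
def changed_mask(prev, curr, threshold=30):
    """Return a binary map: 1 where a grayscale value moved by more than threshold."""
    if len(prev) != len(curr) or any(len(p) != len(c) for p, c in zip(prev, curr)):
        raise ValueError("frames must have the same dimensions")
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


# Two tiny 2x3 "frames"; only the middle column changes significantly.
prev_frame = [[10, 10, 10], [10, 10, 10]]
curr_frame = [[12, 200, 10], [10, 180, 9]]
mask = changed_mask(prev_frame, curr_frame)
print(mask)  # [[0, 1, 0], [0, 1, 0]]
```

Every output pixel depends only on the two input pixels at the same position, which is why this step is a natural candidate for one GPU thread per pixel.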

Thus, this problem can benefit from parallelism in the same way that our circle renderer did: we can divide the screen into tiles and compare the pixels of the current frame to those of the previous frame tile by tile. The problem can also benefit from analyzing different frames in parallel.
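The tiling idea can be sketched serially as follows. This is only an illustration under assumed names (tile_bounds, count_changed_per_tile) and an assumed tile size and threshold; on the GPU each tile would presumably map to a thread block rather than a loop iteration.

```python
def tile_bounds(height, width, tile):
    """Yield (row_start, row_end, col_start, col_end) per tile; edge tiles may be smaller."""
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            yield r, min(r + tile, height), c, min(c + tile, width)


def count_changed_per_tile(prev, curr, tile=2, threshold=30):
    """Count significantly changed pixels per tile, keyed by the tile's top-left corner."""
    height, width = len(curr), len(curr[0])
    counts = {}
    for r0, r1, c0, c1 in tile_bounds(height, width, tile):
        counts[(r0, c0)] = sum(
            1
            for r in range(r0, r1)
            for c in range(c0, c1)
            if abs(curr[r][c] - prev[r][c]) > threshold
        )
    return counts


# A 3x4 frame with changes in the top-left and bottom-right corners.
prev = [[0] * 4 for _ in range(3)]
curr = [[255, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 255]]
counts = count_changed_per_tile(prev, curr)
print(counts)  # {(0, 0): 1, (0, 2): 0, (2, 0): 0, (2, 2): 1}
```

Each tile reads only its own pixels, so tiles can be processed independently until their results have to be combined.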

The Challenge

This program is challenging not only because we need to analyze frames fast enough to deliver results in near real time, but also because we hope to provide a reasonably accurate person detector. Thus, there must be some sort of synchronization between our tiles so that we can determine how large a moving object we are observing.
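One possible way to combine per-tile results, shown here only as a hedged sketch, is to label moving regions inside each tile and then merge labels that touch across a tile seam with a union-find structure. The names find, union, and merge_across_seam, and the convention that label 0 means background, are assumptions for illustration; we have not committed to this approach.

```python
def find(parent, x):
    """Return the representative label for x, compressing the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the sets containing labels a and b; the smaller root label wins."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def merge_across_seam(left_edge, right_edge, parent):
    """Union labels that face each other across a vertical tile seam (0 = background)."""
    for a, b in zip(left_edge, right_edge):
        if a and b:
            union(parent, a, b)


# Labels 1 and 2 live in the left tile, label 3 in the right tile.
parent = {1: 1, 2: 2, 3: 3}
left_edge = [1, 0, 2]   # rightmost column of the left tile
right_edge = [3, 0, 0]  # leftmost column of the right tile
merge_across_seam(left_edge, right_edge, parent)
objects = {find(parent, label) for label in list(parent)}
print(len(objects))  # 2
```

Only the pixels along tile borders need to be exchanged, which keeps the synchronization cost small relative to the per-tile work.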

Since we plan on tiling the image to speed up the computation, the memory accesses will have high spatial locality. In other words, if we load a section of an image into the cache, subsequent memory accesses will likely be cache hits. Also, because we have to load a large amount of data (the many images composing the video), we may have a high communication-to-computation ratio.

Even after we analyze all of the tiles, there are dependences between the pixels whose values have changed significantly, because we must group them to determine the size of the object that has entered our frame. We are also challenged by deciding what change in color value is significant, as lighting and other real-world effects will change pixel values in every frame. In terms of speed, we expect that we will have to use shared memory efficiently to analyze our frames fast enough.

After processing the tiles of an image, we need to group them together to detect an entire person and his/her movement, keeping track of the person's previously known location and creating a vector of his/her direction of movement. One of the key challenges will be to detect a person accurately. Tiling an image and splitting the work between multiple processors makes detecting the features of a person difficult, because the processors must communicate to find the person's location and features. Detecting a person is also necessary to provide accurate statistics and counts of the people within the frame.
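The movement vector mentioned above could, for example, be taken as the difference between a blob's centroid in two consecutive frames. The sketch below assumes a blob is a list of (row, col) pixels and that the same person has already been matched across the two frames; the names centroid and movement_vector are hypothetical.

```python
def centroid(pixels):
    """Return the mean (row, col) of a non-empty list of pixel coordinates."""
    if not pixels:
        raise ValueError("cannot take the centroid of an empty blob")
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)


def movement_vector(prev_pixels, curr_pixels):
    """Return (d_row, d_col): how far the blob's centroid moved between frames."""
    (pr, pc), (cr, cc) = centroid(prev_pixels), centroid(curr_pixels)
    return (cr - pr, cc - pc)


# A 2x2 blob that shifts three columns to the right.
before = [(0, 0), (0, 1), (1, 0), (1, 1)]
after = [(r, c + 3) for r, c in before]
print(movement_vector(before, after))  # (0.0, 3.0)
```

A positive column component here means movement to the right in image coordinates; the sign alone is enough to report a direction of travel.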

Resources

We are starting from scratch. We do not have a book or paper to use as a reference. We need the Gates machines, in particular the NVIDIA GTX 780 GPU, to complete our project, as well as a way to gather a video feed for our project to analyze.

Goals and Deliverables

Plan To Achieve:
We plan to analyze our frames fast enough to detect movement in real time, with a very rudimentary area calculation that helps us determine whether a human is in the frame. Thus, if a moving mass covers at least a certain area, we will assume it is a human.
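The rudimentary area rule can be written in a few lines. In this sketch, blob_areas is assumed to be a list of pixel counts, one per moving region, and the min_area value of 400 pixels is a placeholder that would have to be tuned for the camera and resolution.

```python
def count_people(blob_areas, min_area=400):
    """Count the moving regions whose pixel area is at least min_area."""
    return sum(1 for area in blob_areas if area >= min_area)


# Three moving regions: two large enough to be treated as people, one too small.
areas = [1200, 35, 640]
print(count_people(areas))  # 2
```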

Hope To Achieve:
If all goes well, we would love to add more precise human detection. By this we mean that, in addition to the total area of the moving body, we will check that it is roughly rectangular in shape, attempt to identify a head, etc.
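One hedged way to express the "rectangular in shape" check is to compare a blob with its bounding box: a standing person should fill most of the box, and the box should be taller than it is wide. The thresholds min_fill and min_aspect and the function names below are illustrative assumptions, not tested values.

```python
def bounding_box(pixels):
    """Return (top, left, bottom, right) of a non-empty list of (row, col) pixels."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)


def looks_like_person(pixels, min_fill=0.6, min_aspect=1.5):
    """Heuristic: the blob fills most of its bounding box and the box is tall."""
    if not pixels:
        return False
    top, left, bottom, right = bounding_box(pixels)
    height = bottom - top + 1
    width = right - left + 1
    fill = len(pixels) / (height * width)
    return fill >= min_fill and height / width >= min_aspect


tall = [(r, c) for r in range(6) for c in range(2)]  # 6 rows x 2 columns
wide = [(r, c) for r in range(2) for c in range(6)]  # 2 rows x 6 columns
print(looks_like_person(tall), looks_like_person(wide))  # True False
```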

Furthermore, if this goes well, we would also be interested in having the project account for camera tilt, so that our human detection formulas still work when the camera is angled. This would definitely be a stretch goal and the last thing we would attempt, because even small errors in our equations could produce drastically incorrect results.

Demo:
We plan to run our project on prerecorded security camera footage as well as provide a live demonstration of it in class, where we will ask a student or teacher to walk past the camera. In addition to our live demonstration, we will have speedup graphs that show how much faster our parallel implementation is than the sequential one. If time permits, we will run the sequential and parallel versions of the program side by side on the live feed to show the difference in performance.

Platform

We plan to use CUDA for our project needs. CUDA makes sense for the workload we have chosen because we are trying to process images at a very fast speed, something similar to what we tried to do in the circle renderer.

Schedule

March 29th - April 4th:

  • Finish proposal
  • Begin serial implementation

April 5th - April 11th:

  • Work on serial implementation with a crude area calculation system to determine which objects that pass through our field of view are people.
  • Create benchmarks and graphs/statistics for counting people.

April 12th - April 18th:

  • Finish serial implementation.
  • Begin parallelization by determining how to distribute our frames and an efficient tiling mechanism.

April 19th - April 25th:

  • Finish a correct parallel implementation that implements tiling. This means at this point we should be able to calculate the pixel differences in close to real time.

April 26th - May 2nd:

  • Implement a parallel crude calculation system to determine which objects that pass through our field of view are people.
  • Fine-tune the significance threshold for changes in pixel shading, to separate pixels that change due to the environment from pixels that change because objects have entered our field of view.
  • Begin working on the final report and start implementing our "hope to achieve" goals.

May 3rd - May 9th:

  • Demo testing
  • Finish the final report
  • Work on "hope to achieve" goals

Revised Schedule

March 29th - April 4th:

  • Finish proposal
  • Begin serial implementation

April 5th - April 11th:

  • Work on serial implementation with a crude area calculation system to determine which objects that pass through our field of view are people.
  • Create benchmarks and graphs/statistics for counting people.

April 12th - April 18th:

  • Finish serial implementation.

April 19th - April 22nd:

  • Debug serial version. -Sarah
  • Begin parallelization by determining how to distribute our frames and an efficient tiling mechanism. -Solon

April 22nd - April 25th:

  • Begin implementing functions in parallel. -Solon and Sarah

April 26th - April 29th:

  • Create watershed version. -Sarah
  • Create tiled version of segmentation. -Solon

April 30th - May 2nd:

  • Debug. -Solon and Sarah
  • Fine-tune the significance threshold for changes in pixel shading, to separate pixels that change due to the environment from pixels that change because objects have entered our field of view. -Sarah
  • Begin working on the final report. -Solon

May 3rd - May 9th:

  • Demo testing. -Sarah
  • Finish the final report. -Solon

Work Completed so far

We have completed most of the functions required for a working serial implementation. Starting from scratch, we have created functions that read a frame and convert it into a binary map showing whether each pixel value has changed significantly. We have also completed a very crude implementation of blob detection, which allows us to count the number of "people" going through our frames.
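For readers unfamiliar with blob detection, the sketch below shows one common serial approach: a breadth-first flood fill that groups 4-connected pixels of the binary map into blobs. It is not a copy of our implementation; the function name find_blobs and the choice of 4-connectivity are assumptions for illustration.

```python
from collections import deque


def find_blobs(mask):
    """Return a list of blobs, each a list of (row, col) pixels joined by 4-connectivity."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for r in range(height):
        for c in range(width):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Start a new blob and flood outward from this pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    in_bounds = 0 <= ny < height and 0 <= nx < width
                    if in_bounds and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs


binary_map = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
blobs = find_blobs(binary_map)
print([len(b) for b in blobs])  # [3, 2]
```

The blob sizes are the areas that a crude person counter can then compare against a threshold.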

Goals and Deliverables

It is hard to tell at this point because we are slightly behind on the parallel implementation of our program, but we still believe that we are on track to create a parallelized implementation that can analyze frames in close to real time.

However, it does not appear that we will be able to reach our "nice to have" goals due to time constraints. The serial implementation took longer than we expected, and as a result, we are about a half week or so behind schedule. Furthermore, we believe that parallelizing our blob detection will take longer than we previously expected, because we will be attempting two different parallel algorithms, each with drawbacks and interesting edge cases. Thus, we will need extra time to write and test these two implementations.

Preliminary Results

None at this point.

Resources

We are starting from scratch. We do not have a book or paper to use as a reference. We need the Gates machines, in particular the NVIDIA GTX 780 GPU, to complete our project, as well as a way to gather a video feed for our project to analyze.

Concerns

We are concerned that our serial implementation took longer to implement than we originally expected. As a result, we are about a week behind schedule. We definitely underestimated the difficulty of starting this project from scratch. Looking forward, we are mostly concerned with edge cases that could cause our parallel implementation to run slower than expected. The most difficult portion to parallelize will be segmentation, for which we will write different implementations. Time constraints are our biggest obstacle to completing the project.