Hello everyone, sorry for the long wait. I had promised a tutorial on how I upscaled SD video from Star Trek: Deep Space Nine. I also promised to get you some more upscaled DS9 footage. You’ll get both if you just keep on reading. But before I begin I want to thank everyone for the great response to my ‘DS9 Remastering Using Machine Learning’ proof of concept. Whenever my stuff gets mentioned on the big news sites that I read myself, I get a big kick from it. Over the past few weeks, I have gotten a lot of great feedback from a lot of people. For example, not everyone could see the improvements my machine learning method was able to produce.
YouTube did a number on the compression of some videos. The frame-by-frame and the side-by-side videos didn’t really show off the changes as well as other formats could have. They showed that the image was crisper, but didn’t convey the improvement in detail as well as they could have. The full 5-minute 1080p video, which I had to host somewhere else, didn’t come out as 1080p either. I hadn’t noticed that in my rush to get news of this project out the door. With these lessons learned and some improvements in my upscaling program (AI Gigapixel), I think I’ve made a much better trailer this time around.
Hopefully, this can show everyone the promise machine learning has for upscaling old TV shows like DS9. Note: I handled YouTube’s compression algorithm much better this time around, but it still downgrades it slightly. You can download the source file here if you want to see it in its original glory. EDIT: Here’s an image slider with which you can compare a before-after scene yourself. Now on to the tutorial part of this post. So go ahead and download those programs. Keep in mind that extracting an entire episode requires quite a bit of free space on your hard drive. The episode I exported needs 29 GB for the original frames.
If you upscale to 1080p this will add another 72 GB. If you upscale to 4K you will need around 180 GB for the full episode. This is all for Windows, by the way; I don’t know how it would work on Mac or Linux.
You will use FFmpeg to extract the individual frames of the source video. This is the tutorial I used, which tells you about more options.
1. Get your source video and put it in a folder you can easily work from.
2. Put ffmpeg.exe in the folder you also put your source video into.
4. In cmd, navigate to the work folder with cd. The part after cd depends on where your work folder is, of course.
Now you will give FFmpeg the command to export the frames. What does this all mean?
1. sacrificeofangels.mkv is the source file in my case. Change it to the name and extension of your source video file.
2. soa%04d tells FFmpeg how to name the frames (soa0001.png, soa0002.png, etc.). Soa is an abbreviation of the episode title, which helped me keep multiple videos’ worth of extracted frames organized.
3. .png is the extension of the exported file. PNG is bigger than JPG, but JPG creates extra artifacts that the upscaling program can’t handle very well. So the lossless PNG format is preferred.
FFmpeg compilation information. Personal preference.
When you press enter on the filled in command, FFmpeg will get to work.
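The pieces described above can be put together in a small sketch. This is only my reading of the description, not necessarily the exact command used for the project: the file name and the soa%04d pattern come from the text, while the helper names `build_extract_command` and `frame_name` are made up for illustration, and `-hide_banner` is an assumption about which option the "FFmpeg compilation information" remark refers to. The script only builds and prints the command; it does not run FFmpeg.

```python
def frame_name(prefix, index, digits=4, ext="png"):
    """Show what a single exported frame would be called, e.g. soa0001.png."""
    return f"{prefix}%0{digits}d.{ext}" % index


def build_extract_command(source, prefix, digits=4, ext="png", hide_banner=True):
    """Build (but do not run) an FFmpeg call that dumps every frame as an image."""
    pattern = f"{prefix}%0{digits}d.{ext}"  # FFmpeg fills in the frame counter
    cmd = ["ffmpeg"]
    if hide_banner:
        # Assumption: this is the option that hides the build/compilation info.
        cmd.append("-hide_banner")
    cmd += ["-i", source, pattern]
    return cmd


cmd = build_extract_command("sacrificeofangels.mkv", "soa")
print(" ".join(cmd))
print(frame_name("soa", 1), frame_name("soa", 2))
```

Swap in your own file name and prefix, then paste the printed line into cmd from inside the work folder.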
This will take some time. Do something else in the meantime. You will use AI Gigapixel to upscale the frames from the SD original. There are a lot of things you can do with Gigapixel, but I will focus on the settings that I used to upscale my DS9 footage. The program is very simple to use, so I suggest you just try out all the different settings on some photos to see what the effects of everything are. The program has been refined a lot over the past months, so all official tutorials are out-of-date. Trial-and-error is the quickest way to get the hang of this software.
Once you’ve installed the program, open it up. Drag one of the exported frames to the middle of the program’s window. You will then be shown something that looks like the screenshot below.
1. You can set the upscale’s resolution by scale, width or height.
2. I want to make the upscaled image have the height of 4K footage, so I input 2160 pixels. Full HD has a height of 1080 pixels.
3. Suppress Noise and Remove Blur can improve the quality of the end result. This really depends on the project. Just try out everything to see what gives the best result for your project.
I chose Low and Low, as it gave just a bit of smoothing/noise removal without going overboard.
4. In my workflow I saved to a Custom Folder. This allows you to export your upscaled images to a different folder (I have one for original frames and one for upscaled frames). Ignore Prefix and Suffix.
5. For video it’s best to output in JPG format. It is smaller and results in a faster upscale process. Since the original is PNG, you will need to choose Convert File Format: Yes. The few artifacts created by the JPG format at Maximum Quality won’t be noticed when you later convert the frames to a video. Set Keep Color Profile to Yes, as it’s faster.
Once you have a working setup, you can start upscaling footage. Select a few hundred frames when first getting your footing with this whole workflow and drag and drop them in the window. Upscaling an entire episode can take a long time, so start small. How many frames you can upscale in one go depends on your PC’s specs. It can be 1000s, 10000s, etc. You’ll just have to find out for yourself.
With Audacity you will rip the audio from the source file. Open up the video file in Audacity. The file type you save it as depends on the program you use to create a video from all the individual frames.
For my project, I used VirtualDub, which can’t take all audio types. In this case, I use WAV 16-bit PCM. Save it and you’re done. Note: the source file for the DS9 video that I used had 5.1 surround sound. Basically, there are multiple audio layers. I have not yet figured out how to get the sound as surround sound into a format that VirtualDub can work with. Saving it as WAV 16-bit PCM will collapse this into a mono format. It’s okay for test purposes, but not for a final product. Let me know if you know how to get 5.1 surround into VirtualDub.
Everything will be brought together in VirtualDub. 1. Go to the Video menu. You will find Frame Rate and Compression there. 2. Get the x264 encoder. If you don’t have it, download it here. 3. Here you can determine the quality. I found the settings as displayed the most appropriate. 4. Go to the Frame Rate menu. Set the frame rate in the two fields to the frame rate of the source video. For DS9 this was 29.970628 (a common frame rate). 5. Go to the Audio menu. We need to make sure Interleaving is set up correctly. 6. In the Interleaving menu, set everything as displayed above.
7. Make sure Sync to Audio is on in the Options menu. Save as AVI. Save the file and let it render. You can uncheck Show Input Video and Show Output Video, and set the processing thread priority to a higher priority, to make the video export faster (at the cost of having fewer resources available for other programs).
This is how I did it. No doubt there are alternatives/improvements to the above workflow. If you have knowledge about these kinds of programs, please leave your suggestions in the comments below. This blog post marks the end of this small side-project. While I would love to release full episodes, this is just not legally possible. While entire episodes might be out of the question, now that I’ve shared my workflow with the rest of the world it should be easier for others to try these methods out on clips and segments themselves. Go out and try it out yourself on DS9’s battles of the Dominion War. Or go and see what this could do with Babylon 5 or other great shows that are still stuck in the SD era.
Nevertheless, although both of them are very powerful and provide non-linear model fitting to the training data, data scientists still need to carefully create features in order to achieve good performance. At the same time, computer scientists have revisited the use of many-layered Neural Networks for these human-mimicking tasks. This gave new birth to the DNN (Deep Neural Network) and provided a significant breakthrough in image classification and speech recognition tasks. The major difference of a DNN is that you can feed the raw signals (e.g. the RGB pixel values) directly into it without creating any domain-specific input features. Through many layers of neurons (hence it is called a “deep” neural network), a DNN can “automatically” generate the appropriate features through each layer and finally provide a very good prediction.
This significantly reduces the “feature engineering” effort, a major bottleneck for data scientists. This whole family of techniques is called Deep Learning, and it is catching the whole machine learning community’s attention today. Another key component is how to mimic the way a person (or animal) learns. Imagine the very natural animal behavior of the perceive/act/reward cycle. A person or animal will first understand the environment by sensing what “state” he is in. Based on that, he will pick an “action” which brings him to another “state”. Then he will receive a “reward”. The cycle repeats until he dies. This way of learning (called “Reinforcement Learning”) is quite different from the “curve fitting” approach of traditional supervised machine learning. In particular, learning in RL is very fast because every new piece of feedback (such as performing an action and receiving a reward) is sent immediately to influence subsequent decisions. Reinforcement Learning has had tremendous success in self-driving cars as well as in AlphaGo (the Go-playing program). Reinforcement Learning also provides a smooth integration between “Prediction” and “Optimization” because it maintains a belief about the current state and the possible transition probabilities when taking different actions, and then decides which action can lead to the best outcome. Compared to classical ML techniques, DL provides a more powerful prediction model that usually produces good prediction accuracy. Compared to the classical optimization model using LP, RL provides a much faster learning mechanism and is also more adaptive to changes in the environment.
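To make the perceive/act/reward cycle concrete, here is a minimal sketch using tabular Q-learning, one common Reinforcement Learning method. It is a toy of my own choosing, not how AlphaGo or a self-driving system is built: the five-state "corridor" environment, the function names and all parameter values are assumptions made purely for illustration.

```python
import random

N_STATES = 5        # states 0..4 in a row; state 4 is the goal
ACTIONS = (-1, +1)  # action 0 = step left, action 1 = step right


def step(state, action):
    """Environment: move along the row; reward 1.0 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action] value estimates
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Perceive the state, then act: explore sometimes (or on ties),
            # otherwise take the action that currently looks best.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # The reward is used immediately to update the estimate.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q = train()
policy = ["left" if q[s][0] > q[s][1] else "right" for s in range(N_STATES - 1)]
print(policy)
```

Each pass through the inner loop is one turn of the cycle: sense the state, pick an action, receive a reward, and fold that feedback into the next decision.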
An experienced Magento e-commerce developer can provide you with uniquely tailored and highly functional Magento shopping cart and website solutions. They can even integrate third-party payment gateway apps into your store to make the entire purchase process simple and trouble-free. This feature helps vendors grab the attention of more and more customers and increases the opportunity for success online. Magento is constantly evolving, and new and advanced versions of it are frequently introduced to offer users a superior experience. Because the platform is highly flexible and scalable, the stores designed with it are easy to customize. Depending upon the needs of your business, the developers can at any point in time alter the look or functionality so as to make it a more profitable venture for your business.
Magento shopping carts come loaded with functionality like order management, multilingual support, product browsing, effective catalog management, multi-currency support, single-page checkout and more. Further, all the carts and websites powered by Magento are search engine optimized, ensuring better ranking on search engines. Thus, you can enjoy enhanced visibility and volumes of quality traffic. This way you can greatly expand your customer base and at the same time boost your sales channel. If you want to gain a competitive edge over your competitors, it is best to outsource your Magento e-commerce development needs to a professional and reputed outsourcing company. Most web development companies have a resource pool of highly skilled and talented Magento experts on board. Developers with a sound understanding of the nuances of the platform make the most of it to deliver the client a solution that best satisfies the requirements. The solution offered by the professionals is sure to exceed your expectations, and within the stipulated time frame too. As a leading Magento development company, we do not hesitate to walk that extra mile for you. Hire a Magento developer from us to add something extra to your online presence.
When I was an undergrad, probably my favorite CS class I took was algorithms. I liked it (a) because my background was math so it was the closest match to what I knew and (b) because even though it was “theory,” a lot of the stuff we learned was really relevant. Over time, it seemed like the area had distilled worthwhile algorithms from interesting-in-theory-but-you’ll-never-actually use algorithms. In fact, I think this is a large part of why most undergraduate CS degrees today require a course in algorithms. You have these very nice, clearly defined statements, and very elegant solutions to those statements that in most cases (at the UG level) are known to be optimal.
Fast forward N years. My claim today (and I’m speaking really as an NLP person, which is how I self-identify) is that machine learning is the new core. Everything that algorithms was to computer science 15 years ago, machine learning is today. That’s not to say it won’t move in another 10 years, but that’s how I see it. For the most part, algorithms (especially as taught at the UG level) is the study of one thing: Given a perfect input, how do I most efficiently compute the optimal output. The problem is the “perfect input” part. Even within machine learning you see this effect.
Lots of numerical analysis people have worked on good algorithms for getting that last little bit of precision out of optimization algorithms. Model specification, parameter tuning, features, and data matter infinitely more than that last little bit of precision. In some fields, for instance, scientific computing, that last little bit of precision may matter. Let’s play a thought game. Say you’re an UG CS major. You graduate and get a job in CS (not grad school). Which are you more likely to use: (1) a weighted cost flow algorithm or (2) a perceptron/decision tree? Clearly I think the answer is (2). And I loved flow algorithms when I was an undergrad and have actually spent since 2006 trying to figure out how I can use them for a problem I want to solve. I would actually go further.
Suppose you have a problem whose inputs are ill-specified (as they always are when dealing with data), and whose structure actually does look like a flow problem. There are two CS students trying to solve this problem. Akiko knows about machine learning but not flows; Bob knows about flows but not machine learning. Bob tries to massage his data by hand into the input to an optimal flow algorithm, and then solves it exactly. Akiko uses machine learning to get good edge weights and hacks together some greedy algorithm for flows, not even knowing it’s called a flow. Whose solution works better? I would put almost any amount of money on Akiko.
Full disclosure: those who know about my research in structured prediction will recognize this as a recurring theme in my own research agenda: fancy algorithms always lose to better models. There’s another big difference between N years ago and today: almost every algorithm you could possibly care about (or learn about as an UG) is implemented in a library for any reasonable programming language. Okay, so now I’ve convinced myself that we should yank algorithms out as an UG requirement and replace it with machine learning. But wait, I can hear my colleagues yelling, taking algorithms isn’t about learning algorithms: it’s about learning how to think!
But that’s also what I think is great about machine learning: the distance between theory and algorithms is actually usually quite small (I try to get this across at various points in CiML, to varying degrees of success). If the only point of an algorithms class (I’ve heard exactly this argument made about automata theory, for instance) is to teach students how to think, I think we could do much better. Okay, so I’ve thrown down the gauntlet. Someone should come smack me with theirs :P! I think I probably wrote badly and as a result my main point got lost. I’ll try to restate it here briefly and then I’ll edit the main post. Main point: I feel like for 15 years, algorithms has been at the heart of most of what computer science does. I feel like that coveted position has now changed to machine learning or, more generically, statistical reasoning.
Seth Roberts seems to have found a connection between watching faces and his mood 36 hours later. I think Seth’s experiments are very enlightening inasmuch as they provide a larger view of how much data needs to be used to make sense of them. An electronic scale gives you your weight right now and does not need too much thinking. In fact, it makes the user quite powerless. A 36-hour delay requires a combination of machine learning and attention to how the “thing” communicates with its owner. This reminds me of an aspect of robotics that is sometimes missing. Back when we were building an autonomous car to be fielded in DARPA’s Grand Challenge, we needed to have rapid operational feedback between the algorithm being trained and what the driver was doing. Since the driver couldn’t watch the computer monitor at the same time as the road, we enabled the algorithm to “talk” to the driver.
This course consists of videos and programming exercises to teach you about machine learning. The exercises are designed to give you hands-on, practical experience for getting these algorithms to work. To get the most out of this course, you should watch the videos and complete the exercises in the order in which they are listed. This first exercise will give you practice with linear regression.
Supervised learning problem
In this problem, you’ll implement linear regression using gradient descent. You should see a series of data points similar to the figure below. Before starting gradient descent, we need to add the intercept term to every example.
From this point on, you will need to remember that the age values from your training data are actually in the second column of x. This will be important when plotting your results later.
Linear regression
Now, we will implement linear regression for this problem.
1. Implement gradient descent using a learning rate of $\alpha = 0.07$. Since Matlab/Octave index vectors starting from 1 rather than 0, you’ll probably use theta(1) and theta(2) in Matlab/Octave to represent $\theta_0$ and $\theta_1$. Initialize the parameters to $\theta = \vec{0}$ (i.e., $\theta_0 = \theta_1 = 0$), and run one iteration of gradient descent from this initial starting point. Record the values of $\theta_0$ and $\theta_1$ that you get after this first iteration.
2. Continue running gradient descent for more iterations until $\theta$ converges. After convergence, record the final values of $\theta_0$ and $\theta_1$ that you get. When you have found $\theta$, plot the straight line fit from your algorithm on the same graph as your training data. Note that for most machine learning problems, $x$ is very high dimensional, so we won’t be able to plot $h_\theta(x)$. But since in this example we have only one feature, being able to plot this gives a nice sanity check on our result.
3. Finally, we’d like to make some predictions using the learned hypothesis. You should get a figure similar to the following. If you are using Matlab/Octave, you can use the orbit tool to view this plot from different viewpoints. What is the relationship between this 3D surface and the values of $\theta_0$ and $\theta_1$ that your implementation of gradient descent had found?
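The steps above can be sketched in a few lines. The exercise itself is meant to be done in Matlab/Octave, so this Python version is only an illustration of the same idea, not the course's solution. The six data points are made up (they are not the exercise's data set), and the function names are my own; the learning rate of 0.07 and the 1500 iterations are the values quoted in the exercise text.

```python
def add_intercept(xs):
    """Prepend a constant 1 to every example, so theta[0] acts as the intercept."""
    return [[1.0, x] for x in xs]


def predict(theta, row):
    return theta[0] * row[0] + theta[1] * row[1]


def cost(theta, X, y):
    """J(theta): half the mean squared error of the current straight-line fit."""
    m = len(y)
    return sum((predict(theta, r) - t) ** 2 for r, t in zip(X, y)) / (2 * m)


def gradient_step(theta, X, y, alpha):
    """One batch update; both parameters are computed from the same old theta."""
    m = len(y)
    errors = [predict(theta, r) - t for r, t in zip(X, y)]
    grad = [sum(e * r[j] for e, r in zip(errors, X)) / m for j in range(2)]
    return [theta[j] - alpha * grad[j] for j in range(2)]


def gradient_descent(X, y, alpha=0.07, iterations=1500):
    theta = [0.0, 0.0]
    for _ in range(iterations):
        theta = gradient_step(theta, X, y, alpha)
    return theta


# Hypothetical data lying exactly on the line y = 0.75 + 0.06 * age.
ages = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.75 + 0.06 * a for a in ages]
X = add_intercept(ages)

first = gradient_step([0.0, 0.0], X, ys, 0.07)  # theta after one iteration
theta = gradient_descent(X, ys)                 # theta after 1500 iterations
print(first, theta, cost(theta, X, ys))
```

The learned theta is the lowest point of the cost surface J, which is the connection the 3D plot question is asking about.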
Solutions
After you have completed the exercises above, please refer to the solutions below and check that your implementation and your answers are correct. In a case where your implementation does not result in the same parameters/phenomena as described below, debug your solution until you manage to replicate the same effect as our implementation. A complete m-file implementation of the solutions can be found here. Run this m-file in Matlab/Octave to produce all the solutions and their corresponding graphs. If your answer does not exactly match this solution, you may have implemented something wrong. Did you get the correct $\theta_0$, but the wrong answer for $\theta_1$?
If this happened, you probably updated the terms sequentially, that is, you first updated $\theta_0$, plugged that value back into $\theta$, and then updated $\theta_1$. Remember that you should not be basing your calculations on any intermediate values of $\theta$ that you would get this way. If you run gradient descent in MATLAB for 1500 iterations at a learning rate of 0.07, you should see these exact numbers for theta. If you used fewer iterations, your answer should not differ by more than 0.01, or you probably did not iterate enough. This is close to convergence, but theta can still get closer to the exact value if you run gradient descent some more.
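The sequential-update mistake described above is easy to reproduce. The sketch below is in Python rather than Matlab/Octave, uses made-up data, and the function names are hypothetical. It runs one iteration both ways from $\theta = 0$: the first parameter comes out the same, the second does not.

```python
def predict(theta, x):
    return theta[0] + theta[1] * x


def simultaneous_step(theta, xs, ys, alpha):
    """Correct: both gradients are computed from the same, old theta."""
    m = len(xs)
    errors = [predict(theta, x) - y for x, y in zip(xs, ys)]
    g0 = sum(errors) / m
    g1 = sum(e * x for e, x in zip(errors, xs)) / m
    return [theta[0] - alpha * g0, theta[1] - alpha * g1]


def sequential_step(theta, xs, ys, alpha):
    """Buggy: theta[0] is overwritten before theta[1]'s gradient is computed."""
    m = len(xs)
    theta = list(theta)
    theta[0] -= alpha * sum(predict(theta, x) - y for x, y in zip(xs, ys)) / m
    # The errors below already use the new theta[0], which is the mistake.
    theta[1] -= alpha * sum((predict(theta, x) - y) * x for x, y in zip(xs, ys)) / m
    return theta


# Hypothetical data; any data set with a non-zero mean x shows the effect.
xs = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.75 + 0.06 * x for x in xs]

good = simultaneous_step([0.0, 0.0], xs, ys, 0.07)
bad = sequential_step([0.0, 0.0], xs, ys, 0.07)
print(good, bad)
```

If your first-iteration values match this pattern (right intercept, wrong slope), computing all gradients before assigning any of them fixes it.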