Friday, May 14, 2010

Operation Reboot: IT Professionals Become Computer Science Teachers - US News and World Report

This is an excellent means of getting better quality computer science education into high schools: combine a recently laid off, yet well qualified, IT professional with a high school teacher who lacks proper IT training. The result: high school graduates who are (hopefully) better prepared for computer science right from their freshman year in college, and an employed IT worker. Perhaps these students will be able to take the AP exam in computer science? It's hard to say. Based on my experience teaching introductory computer science, I am seeing far too many students who cannot handle the CS-1 course straight out of high school. If the U.S. is going to regain a place as a dominant force in computer science, we need to push the limits and consider new approaches to increase the readiness of students entering college with an interest in computer science. This is one approach that can improve their chances of success, and hopefully help them understand what computer science entails.

OK, admittedly, my views are often idealistic and neglect the realism that often comes with teaching any topic, let alone computer science. As one IT professional in this article said,

..."it's now a very different lifestyle, .. School is different." Smith, who teaches in an inner-city Atlanta high school, agrees. "It's been a little rough," he says. "Many of these students bring different issues into the classroom from day to day. But I'm committed to doing this."

Ah, yes. The reality of teaching hits every professor, teacher, or instructor at some point early in their career. I've spoken with many colleagues about their early interest in becoming a professor. We expect that we will have students like us, of course! We will have students who want to learn! We are often self-labeled geeks who love learning all about the latest trends and research in our favorite areas of computer science. We thrive on the dynamic aspect of our field, and we hope, and often expect, that we will be able to impart that excitement to our students! Some of us know we are "called" to impart knowledge, have a genuine gift for it, and do it quite effectively and naturally. Even then, most consciously choose higher education over high school education because we want to work with students who choose to be in our class, not those who are forced to be there for whatever reason, which usually results in highly disengaged students whose effort clearly indicates that they would rather be anywhere else but in your class!

Of course, reality soon sets in, usually after the first class. The floodgate of excuses, stories and sheer laziness opens up! Sometimes, students have genuine issues and problems that affect their success. But, more often than not, the excuses are lame (e.g., the sun was out today, forgot to set the alarm, cat stuck up in tree, car died, muffler fell off in the driveway, drunk, sports commitments, or worse yet... drunk sporting events, team practice, fishing club practice (huh???), concerts that needed to be attended, frat pledging, updating Facebook is more important, etc.). Regardless, these are realities, no matter what caliber of college or university you might teach at someday. Some colleges are definitely worse than others. (When you interview, be sure to talk to several students. Try to gain some sense of the type of students you may be dealing with someday.)

Despite the realities, teaching computer science is still one of the best jobs I could ever hope to have. I view the sobering realities as challenges that drive me to maintain a high standard of expectations for my classes, without exceptions, and I make damned sure that those expectations are clearly conveyed and understood in the very first class. I also make sure students know what they should expect of me. Effort needs to be put forth from both sides. Only then does learning take place. Yeah, the realities that come with today's students drive me a little nuts at times.

Perhaps if we improve our high school education, and we let go of the expectation in this country that every person must achieve a college education for success, then we'll slowly get back to having the majority of students being... students.

Tuesday, May 11, 2010

Scientists try to bring order to a glut of drug data - Apr. 29, 2010

Anyone who has done any work in bioinformatics will appreciate this article. Biological data continues to be added to hundreds (if not thousands) of databases worldwide. Unfortunately, with the exception of only a few standards that have survived over the years, there is very little compatibility among the various repositories. Researchers (or more often, their students) spend an enormous amount of time acquiring the data they need to perform their analyses; they often need to write various scripts to get the data from various repositories into the format they need for their own work. Since there is no standard that everyone adheres to, the owners of these repositories occasionally change their own formatting or constraints to suit their own needs, which breaks those scripts. This is valuable time lost that could be spent doing actual research.
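To give a feel for the kind of glue script I mean, here is a minimal sketch in Python. Both record layouts are made up for illustration (a tab-delimited "repository A" and a key=value "repository B"); they are assumptions, not the format of any real database. The point is only that each source needs its own small parser before the data lands in one common shape.

```python
# Hypothetical example: two invented repository formats normalized
# into one common record layout (a plain dict).

def parse_tab_record(line):
    """Hypothetical repository A: 'ID<TAB>organism<TAB>sequence'."""
    ident, organism, sequence = line.rstrip("\n").split("\t")
    return {"id": ident, "organism": organism, "sequence": sequence.upper()}


def parse_kv_record(line):
    """Hypothetical repository B: 'acc=...;species=...;seq=...'."""
    fields = dict(part.split("=", 1) for part in line.strip().split(";") if part)
    return {
        "id": fields["acc"],
        "organism": fields["species"],
        "sequence": fields["seq"].upper(),
    }


if __name__ == "__main__":
    records = [
        parse_tab_record("X1\tE. coli\tacgt\n"),
        parse_kv_record("acc=X2;species=H. sapiens;seq=ggcc"),
    ]
    for record in records:
        print(record)
```

If repository B renames `species` to something else next month, `parse_kv_record` raises a `KeyError` and somebody's afternoon is gone.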


Wednesday, May 05, 2010

N.Y. bomb plot highlights limitations of data mining - Computerworld

The field of data mining is a huge interest of mine. Like most researchers in computational methods, I always keep my ears and eyes open for opportunities to apply these algorithms. Let us consider the use of data mining to aid in identifying terrorists. I am a little surprised that the government bought into data mining as a key solution for this task. (OK, let's be honest: I'm really not surprised, but that is for reasons unrelated to this article.) For the most part, I agree with what Bruce Schneier said. You need a well-defined signal, profile, or pattern in order to extract the item of interest from the data.

The key to successful data mining is, of course, successful learning. The question becomes, how can a computer learn patterns of interest from exorbitant amounts of data when there are so few examples from which these algorithms can learn to distinguish terrorist activity from the overabundant supply of non-terrorist activity? There are (thankfully) very few examples presented each year. The most these algorithms can do is try to learn a model for typical behavior of the average citizen, and flag those who are exhibiting some pattern of behavior that does not fit the model. Another common approach is to provide fictitious examples that "domain experts" think would fit terrorist activity. The NSA is likely providing examples of what they believe represent patterns that led to known activity in the past. Whatever the method, the aim of a terrorist is to follow the model, right? They are only successful if they remain completely inconspicuous. Thus, the problem really does represent the equivalent of searching for a "needle in a haystack" at best. You will undoubtedly deal with a high rate of false positives, which carries an enormously high cost. Or, you adjust your algorithm to turn the sensitivity down. But then you deal with the potentially high cost of a false negative, such as what happened this past week. Where is the "sweet" spot here? (That is the job of risk management!) This is a very difficult problem, indeed.
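To make the false positive problem concrete, here is a back-of-the-envelope sketch in Python. Every number in it is invented purely for illustration (the population size, the number of true cases, and the detector's rates are all assumptions, not figures from the article). It simply counts how many flagged people are real hits versus false alarms when the thing you are looking for is extremely rare.

```python
# Illustrative base-rate arithmetic; all inputs are made-up assumptions.

def flagged_counts(population, true_cases, sensitivity, false_positive_rate):
    """Return (true_positives, false_positives, precision) for a screening rule."""
    innocent = population - true_cases
    true_positives = true_cases * sensitivity          # real cases we catch
    false_positives = innocent * false_positive_rate   # innocent people flagged
    flagged = true_positives + false_positives
    precision = true_positives / flagged if flagged else 0.0
    return true_positives, false_positives, precision


if __name__ == "__main__":
    # Hypothetical scenario: 10 real cases hidden among 1,000,000 people,
    # with a detector that catches 90% of real cases.
    for fpr in (0.01, 0.001, 0.0001):
        tp, fp, precision = flagged_counts(1_000_000, 10, 0.9, fpr)
        print(f"false positive rate {fpr}: {tp:.0f} real hits, "
              f"{fp:.0f} false alarms, precision {precision:.4f}")
```

Even with a detector that is wrong about only 1% of innocent people, the handful of real hits is buried under roughly ten thousand false alarms. Lowering the false positive rate in practice usually means lowering the sensitivity too, which is exactly the trade-off described above.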

I don't think data mining should be disregarded as a potentially useful tool in this field. If nothing else, this represents an area where researchers still have much to learn, so to speak.

On a related note, I'm intrigued as to what these nearly 200 data mining programs are that the federal government has invested in. As we fall into more of a digital world all around us, data will continue to pile up higher and deeper. There is certainly no dearth of opportunities for data mining! Despite this, I think I'll stick with bioinformatics for the time being. One single strand of DNA provides plenty of interesting potential for various algorithms. Consider the opportunities of sifting for informative blocks in DNA over hundreds and thousands of DNA sequences from thousands of species... sounds like a needle in a haystack again!
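As a toy illustration of what "sifting for informative blocks" could look like, here is a minimal Python sketch. It takes one simple reading of "informative", namely runs of columns that are conserved across a set of already-aligned, equal-length sequences, and it measures conservation with Shannon entropy. The function names, the thresholds, and the sample sequences are all my own assumptions; real methods are considerably more sophisticated.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conserved_blocks(sequences, max_entropy=0.0, min_length=3):
    """Return (start, end) half-open index ranges of runs of conserved columns.

    Assumes the sequences are already aligned and all the same length.
    """
    length = len(sequences[0])
    blocks, start = [], None
    # Iterate one step past the end so a run touching the last column is closed.
    for i in range(length + 1):
        conserved = (
            i < length
            and column_entropy([s[i] for s in sequences]) <= max_entropy
        )
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start >= min_length:
                blocks.append((start, i))
            start = None
    return blocks


if __name__ == "__main__":
    aligned = ["ACGTACGT", "ACGTTCGT", "ACGTGCGT"]  # made-up sequences
    print(conserved_blocks(aligned))
```

Three short sequences are easy. Scale that to thousands of sequences from thousands of species, with gaps, noise, and no guarantee of a clean alignment, and the haystack gets very large indeed.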