Copyright notice: All reader-contributed material on freshmeat.net is the property and responsibility of its author; for reprint rights, please contact the author directly.

In his book "Integration-Ready Architecture and Design", Jeff Zhuk states that today's software engineering practices suffer from one serious drawback: the non-reuse of common algorithmic knowledge. For instance, any time an accounting application is written, it is written completely from scratch, despite the fact that:
Jeff Zhuk proposes to solve this problem by using knowledge technologies. As I understand it, the algorithms used by an application should be extracted and put into a database. Then, when an application needs those algorithms, it connects to that database and uses them. In this way, different applications written in different programming languages benefit from the reuse of algorithms.

The idea seems promising to me, and I think it is worth trying on a simple application. In this article, I will present my thoughts about how algorithms can be extracted from the simplest application I can think of (apart from "Hello, world!"): a calculator. I will take an existing calculator application, JCalculator, "cut out" its algorithmic parts, and put them into an OpenCyc knowledge database.

Calculator application before change

Below is a screenshot of the window of JCalculator.
The source code of the application consists of a single file, JCalc.java.

Migration strategy

Our strategy in "cutting out" algorithmic parts will be:
1: Find all algorithmic parts in the original JCalculator source code

We will work more quickly if we determine what algorithmic parts we are searching for. In our case, algorithms are just arithmetic operations:
2: Refactor the source code so that all algorithmic parts are encapsulated in methods
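As a rough illustration of step 2, the sketch below gathers arithmetic operations behind static methods, so that the GUI code never performs arithmetic itself. The class name `CalculatorOperations`, the method names, and the exact set of operations are assumptions made for illustration; they are not taken from JCalc.java.

```java
import java.math.BigInteger;

/**
 * Hypothetical holder for the calculator's arithmetic.
 * Name and method set are assumptions, not the original JCalc.java code.
 */
final class CalculatorOperations {
    private CalculatorOperations() { }

    static double add(double a, double b) { return a + b; }

    static double subtract(double a, double b) { return a - b; }

    static double multiply(double a, double b) { return a * b; }

    static double divide(double a, double b) {
        // Fail loudly instead of silently returning Infinity or NaN.
        if (b == 0.0) {
            throw new ArithmeticException("division by zero");
        }
        return a / b;
    }

    static double power(double base, double exponent) {
        return Math.pow(base, exponent);
    }

    static BigInteger factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        // Multiply 2 * 3 * ... * n; the empty product (n = 0 or 1) is 1.
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }
}
```

With this shape, a later step only has to change the bodies of these methods; callers such as a button handler keep calling `CalculatorOperations.add(x, y)` unchanged.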
We put all arithmetical operations into a single class.

3: Replace Java-based implementations of algorithmic parts with OpenCyc-based implementations

This step is the most interesting. I want to show you several ways of doing things in OpenCyc. Therefore, the solutions presented here may be sub-optimal in real life. They are designed as examples for learning, not as production code. In the following sections, you will learn how to perform simple arithmetical calculations in OpenCyc, program with SubL, and call CycL functions from Java.
If you prefer reading source code to reading natural language texts, you may look at the source file instead.

Performing simple arithmetical calculations in OpenCyc

First, we need to know how to perform arithmetical calculations with OpenCyc. OpenCyc includes a language called SubL, which can be used to perform simple arithmetical operations. Perhaps the best way to show how to work with SubL is to demonstrate it with practical examples. So, if you can, perform the steps described below on your machine. Note that while preparing this paper I used OpenCyc 0.7.0b for Windows.
In the following table, in the column Example, you can see expressions which must be entered at the CYC prompt in order to execute a particular operation.
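Because SubL is LISP-based, arithmetic is written in prefix form, for example `(+ 2 3)`. A minimal Java sketch of assembling such expressions as strings, before they are handed to OpenCyc, might look as follows. The class `SubLExpressions` and its `call` method are hypothetical helpers introduced here for illustration; they are not part of JCalculator or of the OpenCyc API.

```java
import java.util.StringJoiner;

/** Hypothetical helper that formats prefix-notation expressions as text. */
final class SubLExpressions {
    private SubLExpressions() { }

    /**
     * Builds a parenthesised prefix call from an operator and its operands.
     * Operands may be numbers or already-built sub-expressions.
     */
    static String call(String operator, Object... operands) {
        StringJoiner joiner = new StringJoiner(" ", "(", ")");
        joiner.add(operator);
        for (Object operand : operands) {
            joiner.add(String.valueOf(operand));
        }
        return joiner.toString();
    }
}
```

For example, `SubLExpressions.call("*", SubLExpressions.call("+", 1, 2), 4)` yields the nested expression `(* (+ 1 2) 4)`. Building the text in one place keeps parenthesis handling out of the calculator's event handlers.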
In order to execute such computations in OpenCyc directly in Java, we
have to use the
The really important line is the following:
For clarity, let's reformulate this line into:
Implementing factorial and exponentiation is a bit more complex, requiring programming in SubL and CycL. We discuss this issue in the following section.

Programming with SubL

The SubL language is based on LISP and, according to my first impression, enables the programmer to implement routines of any complexity.

Let me give a short bit of background information for readers who are not familiar with the declarative style of programming. At the beginning of the information age (the late 1950s), two languages were born: FORTRAN and LISP.

FORTRAN was the flagship of imperative programming. In imperative programming languages, the programmer tells the machine what instructions have to be executed in what order. The order of instructions matters, and incorrect ordering of instructions is a frequent cause of errors in imperative programming languages. FORTRAN is the root of a large family of imperative languages, to which C, C++, C#, and Java belong.

LISP was the flagship of declarative or functional programming languages. In these languages, programs are similar to mathematical models (collections of formulae). They describe the final result of the calculation and specify all functions (in a mathematical sense) which are necessary to calculate this final result. The order of execution of instructions does not matter. Several programming tasks can be solved in declarative programming languages more quickly (i.e., with less code) than in imperative programming languages. LISP paved the way for later declarative languages such as PROLOG and Haskell.

Let's return from distant history to our current task: we need to define the factorial function. In SubL, this definition looks like this:
This code fragment defines the factorial function.

Calling CycL functions from Java
There remains the last task: implementing exponentiation in OpenCyc. OpenCyc has a predefined function for this, which can be invoked through a CycL query. The second step is incorporated in the following Java method:
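A method of this kind might look roughly like the sketch below. It is an illustration under stated assumptions, not the article's listing: `CycQueryExecutor` is a hypothetical stand-in for whatever OpenCyc client object actually sends the query, and the query shape using `#$evaluate` and `#$ExponentFn` is an assumption about the CycL vocabulary that should be checked against the OpenCyc documentation.

```java
/**
 * Hypothetical stand-in for the OpenCyc client: runs a CycL query and
 * returns the value bound to the query variable as text.
 */
interface CycQueryExecutor {
    String execute(String cyclQuery);
}

/** Sketch of a Java wrapper that delegates exponentiation to a knowledge base. */
final class ExponentiationClient {
    private final CycQueryExecutor executor;

    ExponentiationClient(CycQueryExecutor executor) {
        this.executor = executor;
    }

    /** Assumed query shape; the constant names are not confirmed by the article. */
    static String buildQuery(double base, double exponent) {
        return "(#$evaluate ?RESULT (#$ExponentFn " + base + " " + exponent + "))";
    }

    /** Executes the query and converts the textual answer to a double. */
    double power(double base, double exponent) {
        String answer = executor.execute(buildQuery(base, exponent));
        return Double.parseDouble(answer.trim());
    }
}
```

Keeping the executor behind an interface means the wrapper can be exercised without a running server, for instance with a lambda such as `query -> "8.0"`, while the real implementation would forward the query to OpenCyc.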
This method simply executes the aforementioned query and returns the result as a double value. For more details, I recommend you read the OpenCyc documentation. But if you really want to learn how to work with OpenCyc, reading the docs won't help much. In this case, I recommend you study the files:
I, at least, gained more from reading source code than from reading docs.

Final words

So now we have attained our goal. We took an existing application, extracted its algorithms, and put them into the OpenCyc database. It's time to think about what to do next.

Knowledge-driven software architecture, as proposed by Jeff Zhuk and demonstrated in this example, is a rather new technology. As with everything new, it seems very promising. One can read papers on recent software engineering trends like a thriller. There is a constant feeling that something very big, even revolutionary, is crystallizing in the minds of the authors of those publications. At the same time, these feelings are very abstract. I often have only a vague idea about how to apply those revolutionary ideas.

In my opinion, the time has come to reason about knowledge-driven architectures by means of practical examples. One line of code tells more than a dozen natural language words (except, perhaps, if you are coding in COBOL). So, currently, the most needed things are simple examples which demonstrate what one can do with OpenCyc.

This paper is an answer to the question "What would a calculator application look like, if it were implemented with OpenCyc?". There are many other questions which require answers in the form of code examples:
After reading this article, you can think about your own questions which were triggered by reading it. It would be great if you shared your thoughts with other people interested in improving current software engineering practices (including me).

Appendix: Related materials

Here is a list of materials related to this article:
Author's bio: Dmitri Pissarenko is a 24-year-old Russian software engineer currently living in Austria, Europe. In 2002, he received his Bachelor of Science degree (Computer Science) at the University of Derby in Austria. From 1999 to 2002, he worked primarily as a Web developer with a focus on PHP and ASP applications. Since 2002, he has been working on a Java-based process modeling system. He is not married and has one son.
Basic Research vs Applied Science

Thank you for your thoughtful article. Yes, it is too early yet to see practical applications using Occam's test, but staying inside boxes is what prevents us from making progress. Think about how many light bulbs Edison tried before he found the carbon filament! We are simply too lazy to ask big questions. If you keep working in the direction of your intuition, you may invent the next computer language. Those people who learn to think outside the box unfortunately have to put up with lots of negative energy. That is the main reason why leadership is so hard. It is also the satisfaction.
A fine example of what not to do

This article provides the perfect illustration of why algorithms are seldom reused. The author takes simple algorithms like a+b, which can literally be implemented in three characters, and blows them up into an object-oriented API a few hundred lines long. I doubt he will find any programmer who would want to use it. They will take one look at it, snort with contempt, and write their own.

You see, it is not enough to abstract your algorithms; you have to know how to design a good API, and the first criterion for the latter is simplicity. If a programmer thinks he can write something smaller and simpler, he will do so. The desire to optimize is ingrained deeply, and it takes a lot of talent to design reusable code that will impress its intended users enough for them to accept it. It is a talent that the article's author obviously does not possess yet.

I would like to end with the immortal words of John Carmack: "Get your fat API out of the way and let me at the iron!" and a friendly admonition to the author to take his ugly bloated API elsewhere until he gets a few more years of experience and learns the skill of good design.
Re: A fine example of what not to do

Mike wrote: This article provides the perfect illustration why algorithms are seldom reused. The author takes simple algorithms like a+b which can literally be implemented in three characters and blows them up into an object-oriented API a few hundred lines long. ....

I understand Mike's position (and I could even share a bit of it), but the whole idea of the original article by Dmitri Pissarenko is not to generate human-readable (or reusable) code (even in Java) but runnable code. A Java compiler already generates lots of complex code. If you define in Java a method on your subclass of Integer which adds them, the generated machine code will be a hundred times longer than the assembler code a seasoned human assembly-language programmer would write for an addition (probably a few assembler instructions).

Dmitri Pissarenko's good insight is to use knowledge systems to generate programs. This has already been tried (in particular, Jacques Pitrat has been working on this theme for more than a dozen years with his Maciste system). Unfortunately, his system is not "open source" and is not well documented (but he has published some interesting papers on it).

I do agree with Mike that Dmitri's example is not a particularly good one (a calculator is quite simple, and the issues involved are not algorithmic, but of software design, and highly related to graphical interfaces, not to additions!); however, to give good examples for this you need a lot of work, and the article you would write would be much too long to be accepted on Slashdot (or even elsewhere). But generating code with a knowledge system is, in my opinion, a very interesting and fruitful idea (though it is hard to implement in practice, because only big examples are credible). Also, most of the programming effort is not in designing new algorithms, but in choosing and mixing (i.e. combining) existing algorithms. Here, a reflective [meta-]knowledge-based system would help a lot, and such a system would generate code.

I actually could write a lot more on this, but I don't have the time or the incentive; however, you might look for Jacques Pitrat's papers and books, and also look into the Tunes site (which contains a lot of interesting, if old, material, but no real working code).
Re: A fine example of what not to do

See also two papers by J. Pitrat: "Speaking about oneself" (.doc) and "Implementation of a Reflexive System" (not freely online; Future Generation Computer Systems 12, pp. 235-242, 1996).
Programming is not about algorithms
Re: Programming is not about algorithms

quoting Mike: Trying to replace programmers with a machine again?

But the goal of partly automating programming tasks has been successfully pursued for about 50 years. In the 1950s, compilers like FOR[mula] TRAN[slation] had exactly this goal (avoiding human assembler programming by having the assembler code generated by a program, which is now called a compiler; today human coders rarely code in assembler anymore). Likewise, a compiler made in France in 1959 was called Programmation Automatique des Formules (automatic formula programming), because it translated a Basic-like language into machine code (on a drum computer, the CAB500).

So my point is that software engineering, programming language design, and compiler implementation are all about automating or assisting the task of programming computers (remember, computers only understand machine code). Some knowledge-based systems may help in this task. There are lots of code generators today, and many areas have domain-specific higher-level languages.

Overall, I believe we agree. I don't claim that any fully automatic software will do all the human work in the foreseeable future. But the half-century trend is toward helping human programmers work faster and better, and I do feel that some progress has been made since the 1950s (we human programmers are not working the same way and with the same tools as 50 years ago).

BTW, the PAF language was designed and implemented by my late father in 1959 (the year I was born), and I know for sure that I am not programming the same way as my father did. He coded in machine code, mostly with a pencil and paper; I'm coding in Ocaml or C, mostly with Emacs (on a PC/Linux computer costing 50 times less, and running 100 000 times faster, with 1 000 times more memory than in 1958).

So we both agree. We might have a disagreement on what is called the strong A.I. hypothesis, but this is not very relevant here. Respectful regards.