
Friday, December 17, 2004

Quantity and quality of research 

I found out that publishing a paper in a journal or at a conference is rather easy for those who have good contacts. I agree that this is not completely true, as the research community strives for real results, but some of the papers we come across are of such inferior quality, or lack so much, that they are a waste of space in the journal.
If you know someone and your paper is sent to him or her for peer review, it is very likely to get accepted. Also, not all reviewers take time out of their busy schedules to verify the authenticity of the results. Prestigious journals and conferences seem to spend a lot of effort to ensure that only quality papers get in.
Three or four universities come together and set up an international conference in collaboration with universities from other nations. I am surprised at how much funding government organizations provide to promote research, but the quality has to be maintained. One hope expressed is that once a conference's or journal's reader base is established, it will start accepting more challenging papers under a true quality metric. New journals and conferences are good for "half-baked" ideas, and it should remain that way.

Monday, December 13, 2004

Evolution of Information 

There were a couple of thoughts expressed as feedback during my dissertation proposal presentation that I would like to discuss. The term "Information Evolution" is new and was introduced by Dr. Bayrak and me together. This means the term has no prior significance, and we need to define what it means. Evolution is about never-ending change. We consider information to be evolutionary in nature, as it changes its form and medium.
How exactly can such a phenomenon be justified? Well, it cannot be justified directly, because only the side effects of information evolution can be observed, not the actual changes. Changes in the structure of information are noticeable, but how does one measure the internal or intrinsic changes in the nature of information? We refer to the evolution of information from a semantic, or meaning, point of view. The meaning, or rather the implied meaning, of a particular piece of information changes with context. That is, a particular word might have one meaning in one context but a totally different meaning or sense in another.
The evolutionary behavior of words has been studied as the "Word Sense Disambiguation" problem in Natural Language Processing (NLP).
I suggest we use a multiple answer set paradigm to represent the multiple senses of a word; depending on the context, an attempt can then be made to disambiguate the actual sense in "that" context.
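To illustrate the idea of keeping several candidate senses for a word and picking one from the context, here is a minimal sketch. It is not the answer-set approach proposed above; it swaps in a simple word-overlap heuristic (in the spirit of the Lesk algorithm), and the sense inventory `SENSES` and the function `disambiguate` are hypothetical names with made-up entries.

```python
# Hypothetical sense inventory: each sense of a word is described by a set of
# signature words. These entries are invented purely for illustration.
SENSES = {
    "bank": {
        "financial institution": {"money", "loan", "account", "deposit"},
        "river edge": {"river", "water", "shore", "fishing"},
    }
}


def disambiguate(word, context, senses=SENSES):
    """Return the candidate sense whose signature overlaps the context most.

    Returns None when the word is unknown or when no sense shares any
    word with the context.
    """
    context_words = set(context.lower().split())
    best_sense, best_overlap = None, 0
    for sense, signature in senses.get(word, {}).items():
        overlap = len(signature & context_words)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("bank", "she opened an account to deposit money at the bank"))
print(disambiguate("bank", "we sat on the bank of the river fishing"))
```

The same word resolves to a different sense in each call because only the surrounding words differ, which is the context dependence described above.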
The other issue raised concerns my statement in the proposal that "Information Evolution is a superset of Information Retrieval". In my view, IE is a superset of IR because IE can not only search effectively for information to perform IR-related tasks but can also reason about it. The main objection is that the IE approach does not retrieve documents more accurately than existing IR approaches. In my view, IE is supposed to produce better results than IR. The onus in IE is not just on the accuracy of the results but also on two other objectives:
1. Better reasoning about acquired knowledge/information
2. Better understanding (semantic) of the information acquired.
Now, in order to achieve both of these, we cannot store the information only in the form that is most efficient for retrieval; we also need to store the context, which means storing information that is unnecessary or unwanted from an IR point of view. How, then, can we produce better results than IR when the objectives here are multi-fold?
Dr. Xu asked me during my proposal presentation: "What is the rationale for using a sentence-based LSI approach instead of the existing word-based approach?" This is my response:
Dr. Bayrak and I are both of the opinion that the sentence preserves the true context of the information. Context is important because meaning changes from one context to another, depending on how a word is used and in what context or reference. So using the existing bag-of-words approach will yield better results at the expense of losing context-related information about the acquired knowledge. The only way to preserve it is to read as the human mind does, in terms of sentences. So we preserve sentences in order to preserve context. If a word-by-documents matrix is formed, as in existing approaches, word-based queries can be answered using the cosine similarity of each document vector to the query vector. We instead form sentences and store the correlation delta value of each sentence in the matrix. This means we keep all the words but still form a sentence-by-documents matrix. This approach has two inherent advantages:
1. The sentence x docs matrix has considerably fewer rows than the words x docs matrix.
2. Unlike the words x docs matrix, the sentence x docs matrix still preserves the context of the information as it was presented to us.
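To make the word-based baseline concrete, here is a minimal sketch of answering a query by the cosine similarity between the query vector and each document vector. It uses raw term counts only (no weighting and no SVD step, so it is not full LSI), the function names and the three sample documents are invented for illustration, and it does not attempt the sentence-based matrix, since the correlation delta is not defined in this post.

```python
import math
from collections import Counter


def term_vector(text, vocabulary):
    """Count how often each vocabulary word occurs in text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(query, documents):
    """Return (score, document index) pairs, best match first."""
    # The vocabulary plays the role of the rows of a word-by-documents matrix.
    vocabulary = sorted({w for d in documents for w in d.lower().split()})
    doc_vectors = [term_vector(d, vocabulary) for d in documents]
    query_vector = term_vector(query, vocabulary)
    scores = [(cosine(query_vector, dv), i) for i, dv in enumerate(doc_vectors)]
    # Sort by descending score, breaking ties by document order.
    return sorted(scores, key=lambda s: (-s[0], s[1]))


# Invented sample documents.
DOCS = [
    "information retrieval finds relevant documents",
    "the meaning of a word depends on context",
    "context changes the meaning of information",
]

ranking = rank_documents("meaning context", DOCS)
for score, index in ranking:
    print(f"{score:.3f}  {DOCS[index]}")
```

Note that the word order inside each document plays no role in the score, which is exactly the loss of context that the bag-of-words objection refers to.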
I know some researchers will not agree with this philosophy, but I strongly believe it is right. Sentence-based LSA is the right approach. The only question that remains to be answered is whether the correlation delta is a true representation of the characteristic behavior of the implied text semantics. We will find out!!

I am back with good news

It has been over a month since I last wrote a blog post. I am just too lazy to do things. The entire month of November was blogless for my research diary. What did I do this month? Well, here is the recap.
I presented a paper on "Semantic Information Evolution" at the St. Louis conference on Nov 8th. St. Louis is a nice place. It was a good learning experience for me to present my first research paper outside my university. People from all over the world were present to express their views and share their research work with one another. There was a little ego and pretension that I could sense, but I had actually expected it to be a lot worse. With knowledge, a humble heart is necessary, but most of the time it is the egoistic, hardened views that prevail. I hope and pray I do not end up like that.
Anyway, after the conference I worked on my research proposal. Before the Thanksgiving break I had finished a couple of revisions of the draft. My initial draft of the proposal was really badly organized, and Dr. Bayrak had to give me input to improve it. Subhashish dada, my brother-in-law, helped me get it into a shape where I could get Dr. Bayrak's approval.
In early December, around the 5th, I got feedback from Dr. Bayrak approving the proposal. That meant I needed to give a hard copy in a folder to all the committee members and see whether they would have some free time for me to present the proposal before the end of the Fall 2004 semester. As it turned out, I presented the proposal on December 10, 2004 at 10:00 AM. These were anxious moments for me, as I was not sure how the committee members would take it.
I am going to write a separate blog post on how the committee members found my dissertation proposal, along with my thoughts on some of those issues. To keep it short here, the proposal has been accepted, and a few suggestions have been made by the doctoral advisory committee that I will discuss with Dr. Bayrak in the coming days.
