The ordeal James went through recently, together with the business news I read in the paper about one of the local telecom companies setting up some sort of free online no-call list, got me revisiting the idea.
It is actually an interesting and worthwhile project. A database of the no-call list. You send a query containing one or more telephone numbers to the application server, and it returns those number(s) in your original query which are not on the no-call list.
Now some figures to estimate the scale of the project. The OFTA publishes some statistics about the Hong Kong telecom market. What we are interested in is the total number of phone numbers in Hong Kong. As of June 2007, there are 3,861,898 exchange lines, 364,714 fax lines and 9,569,641 mobile lines, i.e. a total of 13,796,253 Hong Kong 8-digit telephone numbers in use. I took a very brief look at the growth of mobile subscribers over the years, and it seemed to me that the annual growth rate is about 15%.
We need to store the numbers on the no-call list in some sort of database for the lookup. The standard signed 32-bit integer (capable of representing the range from -2147483648 to 2147483647) is quite adequate for storing any number available in the current Hong Kong 8-digit telephone number system (i.e. /[1-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/). In fact the same 4 bytes would be good for storing any number in a 9-digit telephone number system. Assuming that the no-call list database needs to be able to store ALL the regular 8-digit phone numbers in use today (well, June 2007 is close enough), it would take up 13,796,253×4 = 55,185,012 bytes, i.e. about 55MB. The real list will be much smaller, and could in any case easily fit entirely in RAM, so the performance bottleneck would be the network I/O. One doesn’t even need a very powerful server for this. Even at 15% annual growth across all telephone lines (the land line growth rate should be much lower), it would take about five years before this worst-case database grows to over 100MB, which is still well within the capability of today’s computers.
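The arithmetic above can be checked with a few lines of Python. This is only a back-of-envelope sketch: the line counts are the June 2007 figures quoted above, the 15% growth is applied to every kind of line (which overstates land line growth), and the constant names are my own.

```python
import math

# Line counts as of June 2007, as quoted above from the OFTA statistics.
EXCHANGE_LINES = 3_861_898
FAX_LINES = 364_714
MOBILE_LINES = 9_569_641

BYTES_PER_NUMBER = 4    # one 32-bit integer per phone number
ANNUAL_GROWTH = 0.15    # assumption: applied to all lines, a worst case
INT32_MAX = 2**31 - 1   # 2147483647

total_numbers = EXCHANGE_LINES + FAX_LINES + MOBILE_LINES
total_bytes = total_numbers * BYTES_PER_NUMBER

# The largest 8-digit and 9-digit numbers both fit in a signed 32-bit integer.
fits_8_digits = 99_999_999 <= INT32_MAX
fits_9_digits = 999_999_999 <= INT32_MAX

# Whole years of compound growth needed before the table exceeds 100 MB.
years_to_100mb = math.ceil(
    math.log(100_000_000 / total_bytes) / math.log(1 + ANNUAL_GROWTH)
)

print(total_numbers, total_bytes, years_to_100mb)
```

Note that this counts only the raw 4 bytes per number. A set or hash table in a scripting language carries a good deal of per-entry overhead, so the real memory footprint would be several times larger, though still small for a single machine.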
The lookup application would work like this: the user prepares a list of phone numbers which need to be checked against the no-call list, and submits it either via an API or by pasting it into a web form. The API could be based on REST and JSON, and the web form is pretty standard.
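A minimal sketch of the lookup itself, assuming the no-call list is held in memory as a set of integers. The JSON field names ("numbers", "callable"), the function names and the sample numbers are all made up for illustration; nothing here is a fixed API.

```python
import json
import re

# Same pattern as the 8-digit numbering scheme described above.
HK_NUMBER = re.compile(r"[1-9][0-9]{7}")

# Made-up sample data; the real list would be loaded from the backend.
NO_CALL = frozenset({21234567, 91234567})


def callable_numbers(query, no_call=NO_CALL):
    """Return the numbers from the query that are NOT on the no-call list."""
    result = []
    for raw in query:
        text = str(raw).strip()
        if not HK_NUMBER.fullmatch(text):
            continue  # malformed input is skipped; a real API might report it
        if int(text) not in no_call:
            result.append(text)
    return result


def handle_request(body, no_call=NO_CALL):
    """Hypothetical JSON endpoint body: {"numbers": [...]} -> {"callable": [...]}."""
    numbers = json.loads(body)["numbers"]
    return json.dumps({"callable": callable_numbers(numbers, no_call)})
```

For example, `handle_request('{"numbers": ["91234567", "61234567"]}')` gives back a JSON object whose "callable" list contains only "61234567", since the other number is on the sample list. Set membership is a constant-time check on average, which is why the network is expected to be the bottleneck rather than the lookup.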
To provide this service, we need the following pieces of software/equipment.
- Web server to answer queries. For maximum efficiency the web application would be some sort of long-running process written in Erlang, Perl or Python, or even Tcl. The data structure is a simple read-only list, and may be stored entirely in memory using a built-in data structure of the language used to implement the web application. A SQL server is simply overkill for such a simple data structure, so it is not needed here. The web application may be restarted to re-read the no-call list, or it may read the latest changes to the no-call list from the backend, say every hour.
- Backend for administering the no-call list. We need to be able to add to and delete from the no-call list. We would probably wish to store the timestamp at which each phone number was added, and may wish to record the timestamp of every change of state for any number that has been added to our list. We may use a SQL server for this. A separate backend may be necessary because of the tricky stuff it may need to do for authentication and abuse prevention.
- Authentication. This is easy for the 9M-plus mobile phones out there, using SMS. To add your mobile phone number to the no-call list, send an SMS to the system phone number. The backend would read the SMS and add the sender’s phone number to the no-call list. To take your mobile phone number off the no-call list, send an SMS containing a keyword, say “hell” or “remove”, to the system phone number. Requiring a keyword makes it slightly more difficult to accidentally take oneself off the no-call list.
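The SMS rules in the last item could be sketched roughly as follows. This assumes an SMS gateway hands us the sender's number and the message text, which is not something I have looked into; the function name, the return values and the test for mobile numbers (8 digits starting with 6 or 9) are my own assumptions.

```python
import re

# Assumption: mobile numbers are the 8-digit numbers starting with 6 or 9.
MOBILE_NUMBER = re.compile(r"[69][0-9]{7}")

# Keywords suggested above for taking a number off the list.
REMOVE_KEYWORDS = {"hell", "remove"}


def handle_sms(sender, body, no_call):
    """Update the no_call set in place; return 'added', 'removed' or 'ignored'."""
    if not MOBILE_NUMBER.fullmatch(sender):
        return "ignored"  # not a mobile sender, so SMS is not trusted here
    if body.strip().lower() in REMOVE_KEYWORDS:
        no_call.discard(int(sender))
        return "removed"
    # Any other message, including an empty one, adds the sender to the list.
    no_call.add(int(sender))
    return "added"
```

In a real backend the set would be replaced by inserts into the SQL table, with a timestamp recorded for each change of state.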
Authentication for land lines is slightly more complicated. Caller-ID is not always accurate (PABX systems are the biggest problem), so we do not really want to trust it (except for the phone numbers starting with 6 or 9, which are mobile numbers). We could use some sort of telephony system. The user would call the system phone number, and the system would then read out the caller-ID of the caller and ask the user to press a certain key to confirm adding that number to the no-call list. A user without accurate caller-ID would have to submit the phone number through a web form, and the system would then dial that number for manual confirmation. Restrictions on the submission rate (i.e. repeated submission of the same phone number within a short period of time) and on the failure-to-authenticate rate (i.e. no action after submission of a phone number) need to be implemented to prevent abuse.
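The submission-rate restriction could look something like this sliding-window limiter. It is only a sketch: the class name, the default of 3 attempts per hour and the injectable clock are arbitrary choices of mine, and a real system would keep this state in the backend database rather than in process memory.

```python
import time
from collections import defaultdict, deque


class SubmissionLimiter:
    """Allow at most max_attempts submissions per number within a time window."""

    def __init__(self, max_attempts=3, window_seconds=3600.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._attempts = defaultdict(deque)  # number -> recent submission times

    def allow(self, number):
        """Record a submission and return True, or return False if over the limit."""
        now = self.clock()
        attempts = self._attempts[number]
        # Forget submissions that have fallen out of the window.
        while attempts and now - attempts[0] >= self.window:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False
        attempts.append(now)
        return True
```

The web form would call `allow(number)` before asking the telephony system to dial out. The failure-to-authenticate rate could be tracked the same way, by counting dial-outs that were never confirmed.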
It is quite easy to do if the whole thing is done online, i.e. no telephony facility needed. The telephony part may be implemented using Asterisk, but we would then need an account with a VoIP supplier. The whole thing should take a competent programmer no more than a couple of weeks to do.