Category Archives: Thoughts

On planning for the future

Growing up, I looked up to a lot of heroes for motivation, and Steve Jobs is one such inspirational hero for me. He was influential to a lot of the values I hold true today. I had read his autobiography – multiple times – and I think it is one of the best books I've read. I wish I had been able to be join Apple while he was still around, to work with him, and experience the reality distortion field™ in person.

Most of the conversation I have with my friends and family would generally contain topics of what my ideas are, and how I plan to achieve them. They are interest in what my values are, how I make decisions. Usually I end up paraphrasing things that I've before, to others.

I want to summarize some of the principles I hold, as best as I could. Hopefully I don't forget too much.

Principles

On What You Should Do

In the FB post, I talked about two principles (that help decide what you should do).

  1. Do the things that inspire you, that you love, and
  2. try to be really good at it, hopefully having commitment and courage to do it.

Let me elaborate on this, and then talk about some additional principles for how to get you to the goals you want.

Firstly, by doing the things that inspire you and that you love, this results in personal satisfaction. It's very important, because it keeps you motivated for when the problems get tough – and there are going to be a lot of tough problems.

Anything worth doing is going to be difficult.

Of course, the assumption is that these things that you do are going to be creating value; whether it be for yourself or others. Just don't use this as an excuse for binge playing video games, or binge watching Netflix.

Secondly, trying to be really good at it is important for you to create something of value. It's important to have both breadth and depth – and being good at something gives you depth, a competitive edge over others.

Sure, there may be others that are smarter than you. But they have only 24 hours in a day, just as you do. So that places a limit on what others can possibly do, during their spare time. You don't have to be the smartest to have a competitive edge – it's sufficient to excel at skill, albeit seemingly small.

Commitment and courage are other necessary conditions to developing the skill necessary. You should already have the motivation and desire to pursuit something you like doing; whether it be solving a problem or becoming really good in a skill.

The second point complements the first.

These two principles are just about sufficient to be a guide on how to decide what to do, what your next target is, but not how to get there.

On How To Get There

Now that the objectives have been set, what determines the success of it is how to get there.

  1. Manage your free time. We all have 24 hours, how that time is spent that really matters.
    • Identify things that you should invest your time in, in order to reap the fruits of your labor.
    • But also balance life and work. Allocate your most valuable resource, your time, well – with significant others, working out, even video games. The important concept is moderation. For example, watching a season of Netflix is probably no use to getting to your goal.
  2. Read more. Non-fiction. On topics that immediately relevant to you.
    • The value of information depends on context. If I told you what the stock price of some company would be in the future, that would be very valuable; on the other hand, stock price in the past is less valuable.
    • Knowledge is key. Sharing knowledge is also key. Don't forget about your peers.
    • Reading can give you a head start, often what your peers cannot obtain. Compared to others, you are making more informed decisions, better strategies and tactics. Against competition, even the smallest leverage or advantage makes a difference.
  3. Build projects. Develop your skills.
    • But don't work in isolation. Get feedback to improve faster.
    • Success and failure doesn't matter as much; there are things to be learnt even within failures.
  4. Network with others, build your connections.
    • The brain's neurons work with others to achieve a common goal. A single neuron can't achieve much on its own. As I said earlier, anything worth doing is going to be difficult. And difficult things need more resources to be tackled.
    • In machine learning, a neural network is inspired by the neuron architecture. When you learn, you are making new connections; and when you practice, you are strengthening the connections.
    • A strong network makes you better informed, helps you get relevant information quicker, creates more opportunities.
  5. Start today, rather than tomorrow.
    • Why is today such a good place to start? Starting today gives you a head start than starting tomorrow. And since you didn't start yesterday, then today is the optimal day.
    • Take a small step, begin by just doing a little. Great accomplishments don't happen overnight. Great Wall of China wasn't built in a day.

That's all for now folks. Thanks for reading!

PS:

I'm writing this while at the Starbucks on El Camino Real near my house, and coffee's power is fading. I will try to contribute and share knowledge more often.

If you found this blog to be useful, let me know! My email is jason@sunapi386.ca and Facebook is https://www.facebook.com/sunapi386 – feel free to contact me.

Music: Driftmoon.

How I met Sophie

This is a tale about how I met my Sophie. Happy Valentine, Sophie! 情人节快乐!

In February 2016, I spent the first two weeks travelling in China and celebrating Chinese New Year. Having family across the world felt great, and I could visit them any time. China is in an economic boom, and growing very fast. It felt good to be connected to other parts of the world. I had recently graduated university and moved to the Bay Area for just about 3 months. I still felt there was a lot of travelling I could do, to explore the world. But travel is costly, from the time and financial perspective. Is there is another way?

Becoming a hosting on couchsurfing.com

From December to end of March, I was renting the master bedroom in a two bedroom apartment. Since our apartment had two couches, I felt I could make use of them to use it to host some couchsurfers and make friends from all over the world. So I subscribed to the couchsurfing email newsletter.

One day while scanning through my email of couchsurfers looking for hosts, Sophie's profile photo caught my eye. She was very pretty. Her profile described herself as a high level headhunter, ambitious, eager, and interested in startups.

I offered to host her on the first day she lands in US. But she replied with moderate interest, and describing an overly ambitious schedule of visiting too many cities.

Unfortunately, she had found a host already, but she was willing to meeting up for coffee. We met on Sunday Feb. 21, 2016.

That Friday and Saturday, I was hosting my first couchsurfer, Satoshi, from Japan. His profile described him as an aspiring entrepreneur visiting SV. I was hosting him, my friend Shaon was looking to buy a house. While we drove around, I was messaging Sophie to see when and where she can meet up with us.

Meeting my 2nd couchsurfer

At the end, we settled for meeting at my home, and both Sophie and her host met my friend Shaon and Satoshi.

When I first saw her, I had two impressions: "oh my goodness she is beautiful, cuter than her profile photo" and followed by "oh she's shorter than I thought".

After getting acquainted, we all went on a spontaneous trip to see Half Moon Bay and try to catch sunset. I briefly chatted with Sophie in the backseat, before doing the driving myself.

Returning, we went to visit Facebook, which was a fun place and suitable for visitors. Apple had no such open door policy. Sophie and I chatted some more about who we were.

Taking silly photos at Instagram HQ

Taking silly photos at Instagram HQ

Towards the end of our visit at Facebook, I had made a good impression with Sophie. She had changed her mind about not staying over and asked her couchsurfer whether it was okay to stay with me. Of course it was.

First impressions

We chatted late into the night. And as I closed my eyes, I knew I liked her. A lot. I was so excited that I woke up at 5 am to blog. And then after I was done, I woke her up to chat.

We had talked and next thing I know, it was morning. I still had to work, but I decided to take half day off to get up later, show Sophie around my office, then drop her off at Stanford University. And I missed her immediately.

Date #1: Las Vegas

Immediately after dropping her off, I convinced and coordinated with her so I can come to Las Vegas and travel together. Below is a cute video taken at Hoover Dam. I had rented a yellow Camaro. I had always wanted to drive a yellow sports car!

After this trip, Sophie had returned to China. We decided to try long-distance dating, and we did video calling consistently, about 3 times a day, 30 minutes each time.

Date #2: Hawaii

After a while, we wanted to meet up again. My family wanted to do a big vacation, and we settled on going to Hawaii. I invited Sophie to meet with my family.

After the trip to Hawaii, we still continued to do long-distance, and we meet up again. This time I wanted to meet her parents. So I visited China.

Date #3: Shanghai

Being with Sophie in China is like being a tourist. Although I had gone to Shanghai before, having Sophie show me around is a completely different experience.

I had remembered to take some videos this time. (I didn't have videos from Hawaii trip. Shame.)

We are now progressing well.

Ember rails developer

Background

Since April this year, I essentially transitioned from an application developer to a web developer. Full stack.  It's quite exciting! There are some who perceive web development as an inferior choice of work, but I have to disagree. In an era where mom and pop shops have websites, not having a webpage front for your business essentially means you don't exist. Because having a webpage allows you to reach a greater audience. So, I see it as a very powerful leverage.

I have a little background developing for Ruby on Rails, from my Dotabuff scraper project. Now that I look at it, my understanding of rails is fairly incomplete. Rails, and Ember alike, are scaffolding heavy. The intuition is that as the creators of the web framework, they are more knowledgable about how you should go about designing the architecture than you are. Consequently, it means transitioning from one Ember/Rails project to another means very little time is required.

What is Rails/Ember

Rails is a complete web framework. It handles the entire web pipeline. From

  1. routing a URI,
  2. to finding the controller for the page,
  3. who translates the URI to some requests on the data model and accessing the database,
  4. to finally rendering the web page template.

Rails 5 (which should come out of beta this year) has an API-only mode. That means it doesn't template web pages (no step #4). Rather, it replies with the data directly in JSON format.

Ember is like Rails, but it's designed to handle the frontend, leaving the backend architecture up to the user. It means the focus of this framework is on rendering web page template. Any dynamic data that needs to be displayed gets translated to an API call to go fetch the data. Then it converts the response into an ember-data model, and finally fills in the template to display it.

Why make it more complicated by using yet another framework just for the frontend?

So at this point, you're probably wondering why the additional complexity? Why not just use Rails since it supports the entire process?

Well, you see, it's about speed and responsivenessEvery time you visit a different web page on Rails, an entire html file is rendered for you by the Rails server and sent to you. 

Think about that. What does it mean? It means it's slow. Unresponsive.

Rendering pages cost time. Sending the rendered page across the network also cost time. The larger the rendered file size, the longer it takes. And this is not cheap, you notice the web page flashing when you visit different pages.

So that begs the question of how do we avoid sending out rendered pages from the server? Rendering could be done on the client, rather by the server. This is where frontend frameworks come in. Rather than requesting entire web pages all the time, just request the data the page needs as an API call, and render it on the client. In between visiting different pages, data that was available before is put into client's local storage, for faster access via the cache.

Frontend frameworks

Ember is one of three popular frameworks. It's also the most complete framework, as in the most to learn. In terms of scaffolding and complexity, they rank like this:

  1. Ember. The framework for ambitious projects.
  2. Angular. Somewhere in-between.
  3. React. So lightweight, it can hardly be considered a framework.

Any of these would be fine. Indeed, as with any beginner, I did some digging. And discovered that the bigger I dig, the more mess I got myself into. As a rule of thumb, I certainly knew I had no clue what I'm doing with a frontend framework. So if I'm to learn a frontend framework, I might as well learn from what are the best practices, the most scaffolding frameworks. For the same reasons why I prefer Rails as well.

My project

Currently I am building an internal project for Apple, and I (obviously) can't disclose what it is. But it uses Ember and Rails 5 API-only mode.

I setup both Ember and Rails to use web-sockets, but not really hooking them up at the moment, because I'm working on getting out a functional demo website. When the time comes, this website will not only be responsive, but updates in real-time.

See you at my next blog post folks!

Learning Cocoa

Backstory

Haven't blogged in a while, been caught up with transitions to work from just graduating. Recently I started learning Cocoa for work. I'm not completely new to making apps using frameworks, but still pretty fresh out of school.

I have some experience with Android, so this post will describe my experiences learning Cocoa coming from more Android side.

Cocoa vs Android

I found the whole Cocoa framework to pretty acceptable and usable. That being said, tutorials for OS X Cocoa are pretty non-existent, and iOS is marginally better. So unless you pay to buy books, expect heavy digging. Buying a book from Big Nerd Ranch is probably your best bet. Android is much better to learn as a beginner. And then the transition into Cocoa isn't too bad.

If only Apple spent more resources developing and maintaining a set of tutorials for Cocoa, like Google does for Android, I think Cocoa would become slightly more popular.

Of course, the barrier to entry for Cocoa development is still limited by how costly it is for a setup. You'd need iOS or OS X device, and pay $99 for developer access per year. Where as pretty much any phone can run Android, and it's free to develop.

Cocoa's IDE: Xcode vs AppCode

But being a programmer by trade, I'm a little hesitant of doing any work that involves heavy use of GUIs. And developing for Cocoa you'll almost certainly be using Xcode. The alternative is to use another IDE or use the plain old text editor.

Let's face it. You can't manage this with text editors. Given how some files are structured, like XIB files, it pretty much screams at you to use an IDE.

XIB stands for the XML Interface Builder. Interface Builder is a software application which allows you to develop Graphical User Interface with the help of Cocoa and carbon. The generated files are either stored as NIB or XIB files.

If you're looking at IDEs, the major competitor to Xcode would be AppCode. Now Xcode isn't bad itself, anyone who've ever used JetBrains software knows that Xcode's compiler reasoning is garbage. Definitely AppCode wins here. Not only here, but across the board for compiled language IDEs. Making recommendations is their specialty. Ironically, because AppCode is put so much effort here, it isn't very good for making GUIs like Xcode. Among other things utilities like Instruments (monitoring your app), etc. But keep in mind I have used JetBrains stuff longer than I have Xcode, so I have my biases here.

In short, AppCode for writing code. Xcode for making interfaces, because Interface Builder is king.

Programming Paradigms

Both Android and Cocoa follow similar patterns, mainly MVC. Sometime in the future, I want to learn more about Reactive Programming, rather the observer pattern. But for now, I can't comment much, because I haven't got anything to compare to.

Conclusion

Well a lot more can be said about programming for Cocoa. But that's all the blogging I will do for now. Sleep, and I may come back to this post.

Customer Support Platform

Customer Support Platform

October 1, 2015

Increase customer support efficiency by using preformed answers and optionally modifying it before replying to customers.

Goal

Build an API as demo to investors, about 3 weeks away. Basically a customer hands over their customer support chat logs, we provide back query-responses through API. The (reiterated) version of the problem is: retrieve relevant responses (from previously seen responses) based on customer question.

  • Treat customer question as a query.
  • Retrieve a reasonable response.
  • The meat of the problem lies in creating a good mapping from query to a response.

Due to the timely nature of building a demo in short time, I look to using pre-existing tools rather than develop an entire process from scratch. Obviously it’s hard to publish any papers on using existent techiques, but our goal constraint involves more engineering than research.

Pre-existing tools approach

Apache Lucene

Apache Lucene, arguably the most advanced, high-performance, and fully featured search engine library in existence today—both open source and proprietary. But since it is a library only, it would be difficult to get started - you’d need to build around the library. This is the search engine library used behind Wikipedia, Guardian, Stack Overflow, Github, Akamai, Netflix, LinkedIn.

  • Lucene has pluggable relevance ranking models (NLP information extraction and sentiment analysis) are built in, including the the Vector Space Model and Okapi BM25.
  • The power of Lucene is text searching/analyzing. It’s very fast because all data in every field is indexed by default. Text searching focused applications should definitely use Lucene.

There are two predominant platforms built on top of Lucene. Apache Solr, and Elasticsearch. These two are built and designed for full text search on top of Lucene. Both are open source.

ElasticSearch is friendlier to teams which are used to REST APIs, JSON etc and don’t have a Java background, so we’ll run with that.

Elasticsearch

Elasticsearch is also written in Java and uses Lucene internally but makes full-text search easy by hiding the complexities of Lucene behind a simple, coherent, RESTful API.

  • Also pluggable ranking models! This is important to try different approaches to getting good customer results. The modularity of this means we can build one pipeline, and improve our response by using different ranking models.
  • Can be plugged with our own custom ranking functions. For instance, we might care about
    • Information decay, where more recent responses snippet at the top.
    • Ranking based on uses and non-uses of a response snippet.
  • Customer’s questions treated as query input, and support agent’s responses treated as snippets to look up.
  • References
Searching
  • Relevance: Elasticsearch’s main advantage over a traditional database is full-text search. Search results are sorted by their relevance score. The concept of relevance is completely foreign to traditional databases, in which a record either matches or it doesn’t. See Full Text Searching.
  • Phrase Search: Sometimes we want to match exact sequences of words, phrases. Use the match_phrase query in Phrase Search.
  • Highlighting: Although not super important, we can highlight the snippet that matched our search. Highlighting.

Ranking Models

Using a good ranking model is the meat of the problem. Famous ranking models:

  • TF-IDF What is TF-IDF? The 10 minute guide Wikipedia TF-IDF
  • BM25 is regarded slightly better in our case than TF-IDF.
    • Quote from Similarity in Elasticsearch: There is a reason why TF-IDF is as widespread as it is. It is conceptually easy to understand and implement while also performing pretty well. That said, there are other, strong candidates. Typically, they offer more tuning flexibility. In this article we have delved into one of them, BM25. In general, it is known to perform just as good or even better than TF-IDF, especially on collections with short documents.
  • Consider taking Coursera on NLP, learn more about ranking models.

These two above are considered statistical analysis. In recent years, fundamental break-throughs were archieved using machine learning, specifically with neural architectures, in several subfields of AI – computer vision, speech recognition, machine translation. Consequently, more advanced ranking models could be derived from approaches in neural networks.

Training Data

Evaluating any prediction or recommendation engine relies on having a good set of data. The Ubuntu Dialogue Corpus is one such dialogue dataset.

Ubuntu Dialogue Corpus

The Ubuntu Dialogue Corpus, introduced by this paper, contains almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. Along with introduction of the dialogue corpus, the paper also discusses learning architectures suitable for analyzing this dataset.

Specifically, the following architectures are benchmarked for performance:

  • Term Frequency-Inverse Document Frequency (TF-IDF, which is what is used by the Elasticsearch/Lucene engine)
  • Recurrent Neural Network (RNN)
  • Long Short-Term Memory (LSTM) architecture

Performance evaluation is based on the task of best response selection, without human labels. The agent is asked to select the k most likely responses, and it is correct if the true response is among the k candidates. The family of metrics used in language tasks is called Recall@k. For example, k = 1 is denoted as R@1.

The observed result is that the LSTM outperforms both RNN and TF-IDF on all evaluation metrics.

Daerli Chinese Conversation Log

A confidential corpus of support dialogues are to be used in our testing, as customers involved are Chinese companies.

Types of code commenting

In the chapter on commenting, McConnell divides comments into six categories. Figure 3.1 gives summarized versions of his definitions, with some of his commentary.

  • Repeat of the code: States what the code does in different words. Just more to read.
  • Explanation of the code: Explains complicated, tricky, or sensitive code. Make the code clearer instead.
  • Marker in the code: Identifies unfinished work. Not intended to be left in the completed code.
  • Summary of the code: Distills a block of code into one or two sentences. Such comments are useful for quick scanning.
  • Description of the code’s intent: Explains the purpose of a section of code, more at the level of the problem than at the level of the solution.
  • Information that cannot possibly be expressed by the code itself : Copyright notices, confidentiality notices, pointers to external documentation, etc.

Why I Think Now Is A Good Time For Machine Learning

As we all know, since the mid 18th century, when the scientific methods were established, we have gone through a few technological revolutions. Most recently it was the industrialization revolution, and now information is the current revolution. The availability of information is more noticeable with the decreasing cost of storage. Now it is easy to acquire large amount of information; what to do with it is the question. The answer is machine learning. To participate in the information era, you can start off learning about machine learning.

Learning machine learning has a few benefits, which I'll talk about.

  1. There are patterns in the large quantity of data, but it is infeasible for humans to analyze.
  2. Success = Opportunity + Preparation. And there are lots of opportunities. We need the preparation.

Clearly, it is in your favour to study machine learning, so you too can develop the necessary tools to deal with large amount of data. In other words, the availability very large data sets is one of the resources fuelling the information revolution. It seems obvious that being able to utilize this information is key to being part of the current information era.

/more to come