Feb 17, 2014
Table of contents:
One of the most common features of “Web 2.0” applications is the News Feed. A News Feed is basically a single stream of activities, posts or updates from the people you follow or are connected to in the application.
The News Feed concept was of course made popular by the likes of Twitter and Facebook, but it’s a pretty common feature in just about every modern Consumer application. Even Enterprise applications are getting in on the act with the consumerisation of enterprise software.
In this post I’m going to walk you through the basics of getting a News Feed up and running in your application.
So before I begin writing code, first I’ll clarify what I’m trying to build here. Basically I want a user to be able to see a unique stream of posts from all the users she follows, which also includes her own posts.
Whilst their are many intricacies that go into building a News Feed like Facebook’s, when trying to eat an elephant, we have to go one bite at a time.
In order to create this stream of posts, I need to pull the latest posts from the database from all of the users that the current user follows, including her own.
Whilst this is technically not that difficult to achieve, there are some very inefficient ways of going about it.
When trying to create something like this, I find it beneficial to get my head around the raw query first. Open up phpMyAdmin (or whatever database interface you are using) and follow along as we build up the query.
Note: Make sure you have a good mixture of test post data and different user relationships so that you know the query is working correctly!
For my test data, I created four users and some test posts. I then set user 1 to follower users 2, 3 and 4. I then set users 2, 3, 4 to follow either 1 or 2 of the other users at random.
So, the first bit of the query should be what columns we want to select and from what table:
SELECT * from posts;
However, this will return all posts
from the database.
Next we can refine the query by only selecting the posts from the current user (I’m just using the user_id of 1 as an example. In the real world this would be dynamic):
SELECT * FROM posts WHERE user_id = 1;
Now we’re a little bit closer, but we also need to pull all the posts from the users that the current user follows.
In order to also select the posts from the followed users, we can use the SQL IN operator:
SELECT * FROM posts WHERE user_id IN (2, 3, 4) OR user_id = 1;
Great, now we have a stream of posts from all of the users that the current user follows including her own posts!
This query technically works, but can you see what’s wrong with it? In order to generate the comma separated list of users (2, 3, 4)
we need to run a query to pull all of the user ids that the current user follows and then iterate over it to generate the list. We would then need to inject that string into the query above.
This sort of approach can work, but you will probably find that it is incredibly inefficient. Instead, we can use a SQL subselect:
SELECT * FROM posts
WHERE user_id IN (
SELECT follow_id FROM user_follows WHERE user_id = 1
) OR user_id = 1;
As you can see in the SQL above, we are using a subselect
to find the users who are followed by user 1.
If you now run that query on your database and alternate between the active user, you should see that the correct posts are returned in the result set.
With the database query now sorted we can start looking at implementing it using Laravel.
Note: Whilst the query we will be using here is more efficient than the first attempt, it still not going to be good enough to really scale up to become an Internet giant. The likes of Twitter and Facebook are required to be much more sophisticated. However, I’m a strong believer in just getting something working. If you have to worry about scaling your architecture, that is a good problem to have.
To read more about what goes in to scaling a service like Twitter, take a look at this article.
Now that we have the correct SQL statement sorted, we can convert it into an Eloquent method. Again as I mentioned above, I think writing the Eloquent method is much easier when you are 100% sure you already have the correct raw SQL query.
Much like the process above, in order to get to the right end product, we will take it one step at a time.
So as you probably know, in order to find all posts from an Eloquent model, you simply have to run:
$posts = Post::all();
However, we want to use the whereIn
method, so we will need to use:
$posts = Post::whereIn("user_id", [2, 3, 4])->get();
As you can see from the code above, normally you would simply pass an array as the second argument when you are selecting from a small sample. However, because this isn’t likely going to scale for our situation, we can pass a closure so that we can run the subselect
query:
$posts = Post::whereIn("user_id", function ($query) {
$query
->select("follow_id")
->from("user_follows")
->where("user_id", "1");
})->get();
Hopefully, the subselect
should be pretty self explanatory because it is just the same as the raw SQL from earlier.
And finally we need to also include the current user’s posts:
$posts = Post::whereIn("user_id", function ($query) {
$query
->select("follow_id")
->from("user_follows")
->where("user_id", "1");
})
->orWhere("user_id", "1")
->get();
As a dirty little test, copy this final chunk of code into a test route. Fire up the server and dump the $posts
variable so that you can see the correct results are being returned.
Now that I have the SQL query nailed and I’ve translated it into an Eloquent method I can start adding it into the application architecture that I’ve established over the last couple of weeks.
First I will add a method to the EloquentUserRepository
.
/**
* Feed
*
* @param int $id
* @return Illuminate\Database\Eloquent\Collection
*/
public function feed($id)
{
return $this->post->whereIn('user_id', function($query) use ($id) {
$query->select('follow_id')
->from('user_follows')
->where('user_id', $id);
})->orWhere('user_id', $id)
->get();
}
In this method I’ve used the Eloquent method that I wrote earlier but replaced the hardcoded id with the $id
that I passed as an argument. Notice how I’m passing the $id
into the Closure by using the use
keyword. If you are unsure about how Closures work, take a look at this post.
You will notice that I’m performing this method on the $this->post
class property. This is an instance of the PostRepository
that I passed in through the constructor:
/**
* Construct
*
* @param Illuminate\Database\Eloquent\Model $user
* @param PostRepository $post
*/
public function __construct(Model $user, PostRepository $post)
{
$this->user = $user;
$this->post = $post;
}
I inject the PostRepository
through the ServiceProvider:
/**
* User Repository
*
* @return Cribbb\Repository\User\EloquentUserRepository
*/
$this->app->bind("Cribbb\Repository\User\UserRepository", function ($app) {
$user = new EloquentUserRepository(
new User(),
$app->make("Cribbb\Repository\Post\PostRepository")
);
return new CacheDecorator($user, new LaravelCache($app["cache"], "user"));
});
You don’t have to represent each table with a single Repository. In reality you will face a lot of situations like this where you need an instance of another Repository in order to perform an action involving multiple tables.
And finally I can add the feed to the CacheDecorator
so the database is not hit on every request:
/**
* Feed
*
* @param int $id
* @return Illuminate\Database\Eloquent\Collection
*/
public function feed($id)
{
$key = md5('feed.'.$id);
if ($this->cache->has($key)) {
return $this->cache->get($key);
}
$feed = $this->user->feed($id);
$this->cache->put($key, $feed);
return $feed;
}
If you are having trouble understand what is going on in any of the above sections, refer back to previous posts where I explain the architecture and reasoning behind these methods. You can find a list of previous posts at the bottom of this post.
The process of adding a complex feature like a News Feed can seem daunting at first, but if you just take it one step at a time it’s often not as complicated as you first think.
In this tutorial I took each stage a broke it down so I was sure I was getting the intended result. Before writing any code it’s important to ensure that you are using the correct basic database query to pull the results you want. If you don’t get this first step right you will waste a lot of time wondering why your code isn’t producing the right outcome.
This is a pretty basic implementation of a News Feed. In reality we will need to paginate the results, cache the pagination, drop the cache when a new post is created and probably pull more data than just the post such as comments or likes. But as a first attempt I think this is a pretty good foundation to build all of those extra bits of functionality.
In next week’s tutorial I will start looking at pagination and how you can implement it to reduce the stress on your database and to offer a better user experience for your users.
This is a series of posts on building an entire Open Source application called Cribbb. All of the tutorials will be free to web, and all of the code is available on GitHub.