More broadly on EAV
There are lots of EAV (entity, attribute, value) definitions on the Internet, but not all of them explain its basic assumptions clearly. Therefore, beginning developers often find it hard to grasp.
The EAV model is a model that implements assumptions of database normalization. The aim of data normalization consists, among others, in eliminating repeated data from the relational database. In case of working with a lot of various data and of systematic DB growth, it is necessary to disperse information into many smaller tables. During a query, these tables are being combined using foreign keys, which reflects the data atomization postulate. As a result of the normalization process, the database scheme changes from a simple to more complex one. This in turn leads to reducing DB productivity on the one hand, and on the other to increasing data security while also reducing the risk of discrepancies.
In light of the above, we could briefly say that EAV is a deliberate change of the database scheme in order to break down the data into mutually dependent, atomic, unique parts that constitute one whole.
How does Magento implement EAV?
Each entity has a number of attributes. Each attribute is represented by a particular type of data. A given type of data is related to a specific table. MySQL has many types of data, which requires creating corresponding tables.
Broadly speaking, a given entity (e.g. a product or an order) is related to a number of attributes (e.g. name, creation date, price, colour). In MySQL, the attributes refer to types of data, such as varchar, text, int, datetime, decimal, etc. In EAV normalization means breaking down values of these attributes to particular tables, which group them according to their types. Therefore, tables with suffixes varchar or text will contain text data, with suffix datetime – dates, with suffix decimal – numerical values, and so on.
Summing up the above, we can say that:
- An entity represents single data in Magento objects; each entity is stored in DB and has a unique identifier.
- An attribute represents objects’ properties; particular attributes are not placed in the columns of the entity’s (e.g. product’s) table, instead they are linked only with the entity and grouped in specific tables in accordance with their types; as a result, there is a summary table, which stores information on attributes of all entities (if you’re interested, it’s catalog_eav_attribute).
- Value – a normal value of the attribute.
If you understand the above, you’ll see that despite of its architectural complexity, Magento offers a great deal of flexibility in modelling the entities (e.g. configuring the above-mentioned product). It allows for adding and deleting new attributes without the necessity to use the risky ALTER TABLE query, which leads to horizontal growth/reduction of tables by inserting or eliminating subsequent columns. As a result, this could lead to discrepancies and bugs in the DB scheme in the long term. By using EAV instead of ALTER TABLE to modify columns, we allow for safe vertical growth of tables (e.g. for varchar).
Currently, the EAV model in Magento supports the following entities:
- client,
- client’s address,
- product,
- category,
- order,
- invoice,
- credit notes,
- shipping.
According to Alan Storm, „All the available attributes in the entire system (not just for products) are stored in eav_attribute, and the actual attribute values are stored in tables with names like catalog_product_entity_attribute_varchar, catalog_product_entity_attribute_decimal, catalog_product_entity_attribute_etc”.
How does ORM help in handling EAV?
In general, in EAV data are fragmented (atomized), so reading a single record requires at least several joins.
ORM makes the queries to DB very simple. For example, let’s assume that we want to load products from category 13 with selection of the attribute “name”, the enabled products, visible in folders and search results, sorted by price.
We simply have to use the following methods that will do the whole job for us:
$_products = Mage::getModel('catalog/category')
->load(13) // load category by ID
->getProductCollection() // get collection
->addAttributeToSelect('name') // select product name attribute
->addAttributeToFilter('status', 1) // product enabled
->addAttributeToFilter('visibility', 4) // visibility in catalog,search
->setOrder('price', 'ASC'); //sets the order by price
print_r($_products);
That’s it! The work is done and the product collection can be handled further.
However, what if we want to handle EAV using SQL and write the whole query manually? Well, just do the following:
echo $_products->getSelect()->__toString();
Result:
SELECT `e`.*, `cat_index`.`position` AS `cat_index_position`, IF(at_status.value_id > 0, at_status.value, at_status_default.value) AS `status`, IF(at_visibility.value_id > 0, at_visibility.value, at_visibility_default.value) AS `visibility`, `price_index`.`price`, `price_index`.`tax_class_id`, `price_index`.`final_price`, IF(price_index.tier_price IS NOT NULL, LEAST(price_index.min_price, price_index.tier_price), price_index.min_price) AS `minimal_price`, `price_index`.`min_price`, `price_index`.`max_price`, `price_index`.`tier_price` FROM `catalog_product_entity` AS `e` INNER JOIN `catalog_category_product_index` AS `cat_index` ON cat_index.product_id=e.entity_id AND cat_index.store_id=1 AND cat_index.category_id = '13' INNER JOIN `catalog_product_entity_int` AS `at_status_default` ON (`at_status_default`.`entity_id` = `e`.`entity_id`) AND (`at_status_default`.`attribute_id` = '96') AND `at_status_default`.`store_id` = 0 LEFT JOIN `catalog_product_entity_int` AS `at_status` ON (`at_status`.`entity_id` = `e`.`entity_id`) AND (`at_status`.`attribute_id` = '96') AND (`at_status`.`store_id` = 1) INNER JOIN `catalog_product_entity_int` AS `at_visibility_default` ON (`at_visibility_default`.`entity_id` = `e`.`entity_id`) AND (`at_visibility_default`.`attribute_id` = '102') AND `at_visibility_default`.`store_id` = 0 LEFT JOIN `catalog_product_entity_int` AS `at_visibility` ON (`at_visibility`.`entity_id` = `e`.`entity_id`) AND (`at_visibility`.`attribute_id` = '102') AND (`at_visibility`.`store_id` = 1) INNER JOIN `catalog_product_index_price` AS `price_index` ON price_index.entity_id = e.entity_id AND price_index.website_id = '1' AND price_index.customer_group_id = 0 WHERE (IF(at_status.value_id > 0, at_status.value, at_status_default.value) = '1') AND (IF(at_visibility.value_id > 0, at_visibility.value, at_visibility_default.value) = '4') ORDER BY `price_index`.`min_price` ASC
That’s right, the query to DB is very complex! Joins, joins, and joins again… That can give you a headache!
I hope that you understood from the above example what is the idea behind EAV in a Magento database and how important is ORM in supporting it.
Summary
From the perspective of the DB structure, EAV is a model designed for applications which operate on a large amount of various, systematically growing data, such as e.g. web store data. EAV changes the database scheme by atomizing the data and dispersing it into many tables. EAV has the undeniable advantages of flexibility, security and coherence of data. Disadvantages include reduction of the system performance in case of reading large collections of objects and the necessity to create very complex queries to DB.
We can easily circumvent the need to write complex database queries for EAV using the ORM, which is, so to speak, a layer connecting the layer of the application’s business logic with the database. In just a few lines of code we can order Magento to read, modify or delete particular data in the DB, e.g. data about a product.
I hope that this article was helpful in understanding the ORM and EAV models in Magento. If you crave for additional knowledge, you might get more information from the sources below.
Good luck!
Sources