SQL University : An Introduction to BI – Part I

November 23rd, 2009 govindsyadav No comments

Welcome to Class Guys , Well , we will be starting our week of by taking a sneak peek into SQL BI gearing up to the more elaborative and deep understanding about BI and its applications .

Before , just diving in , I would like to mention here , the useful resources , which are being referred here up in this article.

  1. SQL Server Books Online (http://msdn.microsoft.com/en-us/library/ms130214.aspx)
  2. Jorge had written about the Sql server books online , and its availability in his article ,which can be found on http://sqlchicken.com/2009/09/sql-university-basic-tools/

  3. http://www.sqlauthority.com

First of all , a general question I would like to answer about BI .

Q . What is BI and what exactly it is there for ?

Answer : BI Stands for Business Intelligence .

We’ll make it little bit mathematical

BI=Business Intelligence

BI = Business + Intelligence

      Business= (Enterprise , Customary,etc)

      Intelligence= Decision

                  Decision = Data and Analysis

                           Analysis = Skills + Processes + Technologies + Tools(applications)

BI = (Enterprise,Customary,etc) + (Data and analysis)

So, now from above equation its clear that BI Means Analysis of Enterprise/customary Data .

Now , just summing up the above equation in words , BI basically refers to Processes,Skills,Applications,Technologies that are use to support the Decision making Process.

BI Technologies : When we do talk about BI Technologies , then its not just about reporting , but its lot more than reporting .

Just to explain about why BI is lot more than reporting , I would like to state here an live example ( going out of boundaries of BI and sql server) , of ICEBERG . when we see an iceberg , its 10% above the sea water while 90% beneath , so that’s the same thing with Reports , reports are only the visualization of that 90% of what is beneath it i.e a lot things including , Data , OLAP , Text mining ,Querying , Statistical analysis ,Benchmarking ,Predictive Analysis .

Summing up the above we got to know a new equation

BI = Reporting + Data mining + OLAP +Text mining + Benchmarking + Statistical analysis + Predictive Analysis + Querying .

Its Well Said by Danny Siegel about BI , that an effective BI system provides corporations with “one version of the truth”.

What BI is for ?

BI is there for Success of a business , because it helps you in making decisions and when you are able to make correct , accurate decisions on time then only the success is yours . Because if BI is there , then you are handy with your statistical analysis and predictive analysis (future , foreseen analysis) and can better make your decisions to remain competitive and stay ahead .

Hopes , the question we had started off with is answered and we are quite clear about the concept of BI.

Now , another Question Comes out to mind .

Q. What is the History of BI  ? Where all it came from ?

Answer : A basic question , what is the history of BI , how it came into existence and from where it get evolved , is it something new born ??

No , its not something which is recently evolved but today it is used more and more that’s why we are wondering and taking it as some new born , but its nothing like that ,

In a 1958 article, IBM researcher Hans Peter Luhn used the term business intelligence. He defined intelligence as “the ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal.”

In 1989 Howard Dresner (later a Gartner Group analyst) proposed BI as an umbrella term to describe “concepts and methods to improve business decision making by using fact-based support systems.”. It was not until the late 1990s that this usage was widespread.

Wow , so above we came to know that BI do exists there since from a very long time , but today it is widespread and is present everywhere ,no matter whatever the business is , whatever the technology is , if some system is there , decision is there , BI is THERE .

Now comes another question :

Q. If BI is so vast , then from where to start with .?

Answer : Anywhere , that totally depends upon your choice , you can go for data mining , OLAP ,Text mining , Reporting , that all is your part , you need to decide what best you can do , because there are lots of tools available under BI , but you had to choose which tools you wish to learn , use and master .

Another tricky question :

Q. What language(terminology) does BI Speaks ?

Answer : Sure , that’s very important to learn the BI Language(Terminology) , because unless and until we are familiar with BI Terminology , we won’t be able to get hands on it , so just a quick view of BI Terms :

Aggregation : Information stored in summarized form

Agent : Application (client ) who searched the data and sends the alert.

Alert : Message sent by agents

Attribute : Fields or columns in table ,report ,charts .

Cube : Known as multidimensional format used for viewing and analyzing data .The fundamental structure for data in a multidimensional (OLAP ) System.

Data : Reality , facts stored in and processed.

Data Base : Collection of data

Data Cleansing / Scrubbing : Removing erroneous data

Data Mart : Local Data Warehouse , Small subset of Data warehouse for one or few divisions

Data Migration : Process of moving of data from one environment to other

Data Mining : Digging deep the hidden relationship of data and attributes among each other

Data Transformation : Process after data migration is done ,i.e the step of modifying the data as per the new data warehouse.

Data Warehouse : A storage of Enterprise/ customary data meant for to be analyzed .

Drill down and drill up : Process to move across hierarchy used in OLAP.

ETL : Extract , transform and load makes ETL . i.e get the data , transform it(modify) and migrate it .

DTS : its below ETL , because it’s a tool for ETL .

Star Schema : The simplified database as per the normalized forms of database management system

Fact Table : the table containing :

  1. The fields storing the foreign keys which connect each particular number to the appropriate value in each dimension.
  2. The fields storing the individual number, measure or any numeric value. Example(discount,price,etc)

Measure : A numeric value in fact table

Multidimensional Analysis / OLAP =Online Analytical processing, the technique to process and fetch out results in different –different dimensions of data i.e cubes .(widespread use of graphs and spreadsheets )

Slice : Tool for OLAP Viewing of data through multi dimensions example : Where clause in OLAP.

Dice : Tool for OLAP Viewing of data through multi dimensions

Replication : Replica of one data at another place.

Schema : Logical arrangement of data within database.

MDX : Multidimensional expressions, the query language for OLAP ,

Hopes , now ,as we have discussed couple of basic questions about BI ,we are ready with to get dive into the BI sea , but before getting in it , we’ll have a break and will continue on this interactive session in the same manner (if nobody complains of this is not a good idea to learn about what BI is .?) .

While writing this article , I was just wondering up from where to start this article on , because it is such a vast area to discuss on with that I cannot see the boundaries of , then thought of , why not to make it more interactive in an question / answer manner . I myself started asking questions to me as a layman and then tried to answer the questions .Hope you’ll enjoy this question answer class of BI , not to mention here , that if you do have any question up there in mind that is not covered up or required some more lights , then do revert back , I am waiting for your questions and would try my best to answer them .

You can reach me at govindsyadav[@]soliddotnet[dot]com

VN:F [1.6.3_896]
Rating: 4.5/5 (2 votes cast)

C# Using Statement – Try / Finally – IDisposable – Dispose() – SqlConnection – SqlCommand

November 17th, 2009 Neeraj Mathur 1 comment

While viewing some of the C# code written by a new programmer, I noticed that they lack of calling Dispose() method on SqlConnection and SqlCommand objects. And the dabase code was not placed in try/finally blocks. This coding style is typical newbie style of development that everyone, including myself, attempted in the very beginning.

SqlConnection conn = new SqlConnection(connectionString);
SqlCommand cmd = new SqlCommand(commandString, con);
con.Open();
cmd.ExecuteNonQuery();
con.Close();

The problem in the above code is that SqlConnection and SqlCommand implement IDisposable, which means they could have unmanaged resources to cleanup and it is our job, to make sure Dispose() gets called on these classes after we are finished with them. And, because an exception could be raised if the database is unavailable, we need to make sure Dispose() gets called even in the case of an exception.

Its better to use the “using” keyword in C#. Internally, this generates a try / finally around the object being allocated and calls Dispose() for you. It saves you the hassle of manually creating the try / finally block and calling Dispose().

The new code would looking something like this:

using (SqlConnection con = new SqlConnection(connectionString))
{
    using (SqlCommand cmd = new SqlCommand(commandString, con))
   {
      con.Open();
      cmd.ExecuteNonQuery();
   }
}

This is essentially equivalent to the following:

SqlConnection con = null;
SqlCommand cmd = null;
try
{
     con = new SqlConnection(connectionString);
     cmd = new SqlCommand(commandString, cn);
     con.Open();
     cmd.ExecuteNonQuery();
}
finally
{
      if (null != cm);
         cmd.Dispose();
      if (null != cn)
         con.Dispose();
}

You may notice the lack of calling Close() on the SqlConnection class, con. Internally, Dispose() checks the status of the connection and closes it for you. Therefore, technically you don’t need to call Close() on the connection (con) as Dispose() will do it for you. In addition, Dispose() destroys the connection string of the SqlConnection class. Therefore, if you want to re-open the connection after calling Dispose() on con, you will have to re-establish the connection string. Not doing so will throw an exception.

VN:F [1.6.3_896]
Rating: 4.0/5 (1 vote cast)
Categories: ASP.Net, C Sharp Tags: , , ,

Understanding ASP.Net : A Complete Overview

November 8th, 2009 Neeraj Mathur 1 comment
VN:F [1.6.3_896]
Rating: 5.0/5 (1 vote cast)
Categories: ASP.Net Tags:

Advantages of using tableless CSS | XHTML Web Design Techniques

November 7th, 2009 Sumit Gilhotra No comments

Tables used to be the “in” thing when it came to designing web sites. As with many other things though, new techniques and methods standards are bound to emerge with time. When once tables were seen in most web pages, some experts now suggest that tables should be thrown out of the window.

In simple words, tableless web design is basically a method whereby page layout control is achieved without the use of HTML tables. Instead, text and other elements on a page are arranged using CSS (Cascading Style Sheets). This language is the brainchild of the W3C (World Wide Web Consortium). It was designed in such a way as to improve web accessibility as well as to make use of HTML for semantic purposes rather than presentational purposes.

One thing that has been making the headlines in the past year or so is the term SEO (Search Engine Optimization) techniques. With search engines such as Google and Yahoo making big waves in the information sector, web designers are scrambling to get on their good side. The one main goal of a web designer is to get his site on the top pages for search results. How is this achieved? By making one’s site search engine friendly.

Search engines make use of various techniques to index all existing web sites. Based on these various techniques, they assign a page rank to the web site. The higher the page rank, the more chances it will be high up in the search engine results. The higher up the site is in ranking, the more chances that people will visit the site. That means good news for the web site. That is where semantics and HTML come in.

So how about tables? Web designers who favor tables assert that they make the design process easier and less time consuming. More so, they assert that tables are more compatible with various web browsers. On the other hand, proponents of tableless formats assert that tables do not adhere to web standards and web accessibility.

Over the years, people have been trying to come up with web standards that are based on logic and that would make it easier for everyone involved to access web sites. Text readers, bots, mobile devices, and other elements were all taken into consideration. As such, the tableless format has come into popular use.

Why should you follow this standard? Here are a few reasons as to why you should go tableless:

. The current W3C standards dictate the use of tableless design.
. Practically all browsers in use today support CSS for controlling layouts. As such, your site will be compatible with most any browser.
. It is easier to make global changes to the layout with the use of CSS. That is, if the coding is properly done, of course.
. Web site accessibility for people with special needs is done more easily with the proper implementation of content into XHTML documents. In this case, CSS is used only for the layout and style.
. Unnecessary code is eliminated with the use of XHTML and CSS, making for a sleeker and more manageable code.
. Tableless formats make it easier for search engines to index a web site.

Though tableless formats are being widely used for page layout control, it does not necessarily mean that tables are not being used anymore. They are merely not optimal for presentation purposes.

VN:F [1.6.3_896]
Rating: 3.3/5 (3 votes cast)
Categories: ASP.Net Tags: , ,

SQL Server 2005 – Merge Replication

November 2nd, 2009 Sumit Gilhotra 1 comment


  • Introduction

Replication is a set of technologies for copying and distributing data and database objects from one database to another and then synchronizing between databases to maintain consistency. Using replication, you can distribute data to different locations and to remote or mobile users over local and wide area networks, dial-up connections, wireless connections, and the Internet.

Replication is the process of sharing data between databases in different locations. Using replication, we can create copies of the database and share the copy with different users so that they can make changes to their local copy of database and later synchronize the changes to the source database.

Terminologies before getting started:

Microsoft SQL Server 2000 supports the following types of replication

Publisher is a server that makes the data available for subscription to other servers. In addition to that, publisher also identifies what data has changed at the subscriber during the synchronizing process. Publisher contains publication(s).

Subscriber is a server that receives and maintains the published data. Modifications to the data at subscriber can be propagated back to the publisher.

Distributor is the server that manages the flow of data through the replication system. Two types of distributors are present, one is remote distributor and the other one local distributor. Remote distributor is separate from publisher and is configured as distributor for replication. Local distributor is a server that is configured as publisher and distributor.

Agents are the processes that are responsible for copying and distributing data between publisher and subscriber. There are different types of agents supporting different types of replication.

Snapshot Agent is an executable file that prepares snapshot files containing schema and data of published tables and database objects, stores the files in the snapshot folder, and records synchronization jobs in the distribution database.

An article can be any database object, like Tables (Column filtered or Row filtered), Views, Indexed views, Stored Procedures, and User defined functions.

Publication is a collection of articles.

Subscription is a request for copy of data or database objects to be replicated.

arc

Replication Types

Microsoft SQL Server 2005 supports the following types of replication:

  • Snapshot Replication
  • Transactional Replication
  • Merge Replication
Snapshot Replication
    • Snapshot replication is also known as static replication. Snapshot replication copies and distributes data and database objects exactly as they appear at the current moment in time.
    • Subscribers are updated with complete modified data and not by individual transactions, and are not continuous in nature.
    • This type is mostly used when the amount of data to be replicated is small and data/DB objects are static or does not change frequently.
Transactional Replication
    • Transactional replication is also known as dynamic replication. In transactional replication, modifications to the publication at the publisher are propagated to the subscriber incrementally.
    • Publisher and the subscriber are always in synchronization and should always be connected.
    • This type is mostly used when subscribers always need the latest data for processing.
Merge replication

It allows making autonomous changes to replicated data on the Publisher and on the Subscriber. With merge replication, SQL Server captures all incremental data changes in the source and in the target databases, and reconciles conflicts according to rules you configure or using a custom resolver you create. Merge replication is best used when you want to support autonomous changes on the replicated data on the Publisher and on the Subscriber.

Replication agents involved in merge replication are snapshot agent and merge agent.

Implement merge replication if, changes are made constantly at the publisher and subscribing servers, and must be merged in the end.

By default, the publisher wins all conflicts that it has with subscribers because it has the highest priority. Conflict resolver can be customized

Before starting the replication process:

assume that we have 2 server:

  • EGYPT-AEID: is the publisher server ( contains HRatPublisher )
  • SPS: is the subscriber server ( contains HRatSubscriber )
    use SQL server Authentication mode for login

on the publisher database i created table: Employees with fields of (ID, Name, Salary) to replicate its data to the subscriber server.

i will use publisher as subscriber also

Note: Check that SQL Server Agent is running on the publisher and the subscriber

Steps:

  1. Open SQL Server Management Studio and login with SQL Server Authentication to configure Publishing, Subscribers, and Distribution

1

a- Configure the appropriate server as publisher or distributor.

2

b- Enable the appropriate database for merge replication

pub db

2- Create new local publication from DB-Server –> Replication –> Local Publications –> Right Click –> New Pub

bub1

then choose the database that contains the data or objects you want to replicate

image

then choose the replication type and then specify the SQL server versions that will be used by subscribers to that publication like SQL Server 2005, SQL mobile Edition, SQL for Win CE ….etc

bub4

after that manage the replication articles, data and db objects, by choosing objects to be replicated

Note: you can manage the replication properties for selected objects

bub5

Then add filters to published tables to optimize performance and then configure the snapshot agent

bub6

bub7

and configure the security for snapshot agent

bub8

finally rename the publication and click finish

bub9

3- create a new subscription for the created “MyPublication01” Publication by right click on MyPublication01 –> New Subscription

that to configure the “Merge Agent” for replication on the subscriber database

pub10

bub11

then choose one or more subscriber databases. you can add new SQL Server subscribers

image

then specify the Merge Agent security as mentioned above on “Agent Snapshot”

and so specify the synchronization schedule for each agent.

Schedules:

  • Run Continuously: add schedule times to be auto run continuously
  • Run on demand only: manually run the synchronization

image

and then next up to final step, then click finish

you can check the errors from “Replication Monitor” by right click on Local Replication –> Launch Replication Monitor

Advantages in Replication:

Users can avail the following advantages by using replication process:

  • Users working in different geographic locations can work with their local copy of data thus allowing greater autonomy.
  • Database replication can also supplement your disaster-recovery plans by duplicating the data from a local database server to a remote database server. If the primary server fails, your applications can switch to the replicated copy of the data and continue operations.
  • You can automatically back up a database by keeping a replica on a different computer. Unlike traditional backup methods that prevent users from getting access to a database during backup, replication allows you to continue making changes online.
  • You can replicate a database on additional network servers and reassign users to balance the loads across those servers. You can also give users who need constant access to a database their own replica, thereby reducing the total network traffic.
  • Database-replication logs the selected database transactions to a set of internal replication-management tables, which can then be synchronized to the source database. Database replication is different from file replication, which essentially copies files.
Replication Performance Tuning Tips:
  • By distributing partitions of data to different Subscribers.
  • When running SQL Server replication on a dedicated server, consider setting the minimum memory amount for SQL Server to use from the default value of 0 to a value closer to what SQL Server normally uses.
  • Don’t publish more data than you need. Try to use Row filter and Column filter options wherever possible as explained above.
  • Avoid creating triggers on tables that contain subscribed data.
  • Applications that are updated frequently are not good candidates for database replication.
  • For best performance, avoid replicating columns in your publications that include TEXT, NTEXT or IMAGE data types.

If you have any question pls let us know…

VN:F [1.6.3_896]
Rating: 0.0/5 (0 votes cast)