[MySQL] Parse Top Level Domain from URL, 리퍼러
페이지 정보
작성자 서방님 댓글 0건 조회 3,112회 작성일 12-09-26 16:06본문
출처 : http://blog.netnerds.net/2007/01/t-sql-parse-top-level-domain-from-url/
I keep track of all hits to my website RealCajunRecipes.com in a SQL table called hitcounter which has columns for the user's IP, browser, referring URL and the date. Recently, I saw a surge in traffic and wanted to know which domains were sending the traffic our way. After getting tired of issuing ad-hoc queries that included WHERE clauses like "where referer like '%google%'", I created a SQL Server user-defined function (UDF) to extract the domain from the referring URL.
CREATE FUNCTION [dbo].[parseURL] (@strURL varchar(1000))
RETURNS varchar(1000)
AS
BEGIN
IF CHARINDEX('http://',@strURL) > 0 OR CHARINDEX('https://',@strURL) > 0
-- Ghetto-tastic
SELECT @strURL = REPLACE(@strURL,'https://','')
SELECT @strURL = REPLACE(@strURL,'http://','')
SELECT @strURL = REPLACE(@strURL,'www','')
-- Remove everything after "/" if one exists
IF CHARINDEX('/',@strURL) > 0 (SELECT @strURL = LEFT(@strURL,CHARINDEX('/',@strURL)-1))
-- Optional: Remove subdomains but differentiate between www.google.com and www.google.com.au
IF (LEN(@strURL)-LEN(REPLACE(@strURL,'.','')))/LEN('.') < 3 -- if there are less than 3 periods
SELECT @strURL = PARSENAME(@strURL,2) + '.' + PARSENAME(@strURL,1)
ELSE -- It's likely a google.co.uk, or google.com.au
SELECT @strURL = PARSENAME(@strURL,3) + '.' + PARSENAME(@strURL,2) + '.' + PARSENAME(@strURL,1)
RETURN @strURL
END
This script does the following:
1. Checks to see if the string is an URL
(example: str = http://www.search.google.com.au/?q=netnerds)
2. Removes http, https and www (str = search.google.com.au/?q=netnerds)
3. Removes everything after the slash (str = search.google.com.au)
4. Removes excessive subdomains (str = google.com.au)
The script isn't perfect; I saw things like mysearch.myway.com get by but it's good enough for general use. If you'd like to see the entire domain, just remove the 4 line chunk marked "Optional."
To call this using SQL, modify this sample script to suite your environment:
SELECT COUNT(*) as theCount, dbo.parseURL("referer) as referer FROM hitcounter
WHERE referer IS NOT NULL
GROUP BY dbo.parsedomain(referer)
ORDER BY thecount DESC
Your results should look something like this
11831 | google.com |
10542 | yahoo.com |
9101 | msn.com |
746 | google.ca |
624 | google.co.uk |
Note: NULLs aren't parsed and thsu won't kill this function..they'll just show up as NULL.
댓글목록
등록된 댓글이 없습니다.